How fast can an STT system be made?







( — ) , . , , , state-of-the-art bleeding edge , , , " ". , GPU, , , ( 2 — 2.5 GHz AMD 4+ GHz 64 ). , , !







3 :







  • ;
  • , ;
  • SLA ;
  • " " ;


:







  • ?
  • ?
  • ?




. STT , Facebook AI Research , , . , " ".







, 3 , "":







  • Throughput ( ) — . "";
  • Real Time Factor (RTF) ( ) — , . Real Time Speed (RTS) 1 / RTF, , 1 ;
  • Latency () — - ;
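The relationship between RTF and RTS in the list above can be sketched in a few lines. This is a minimal illustration, assuming the common definition of RTF as processing time divided by audio duration; the function names and the sample numbers are hypothetical and do not come from the article.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF under the common definition: processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def real_time_speed(rtf: float) -> float:
    """RTS as given in the list above: 1 / RTF."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical example: 10 s of audio transcribed in 2.5 s.
rtf = real_time_factor(2.5, 10.0)
rts = real_time_speed(rtf)
print(f"RTF = {rtf:.2f}, RTS = {rts:.1f}")
```

An RTF below 1 (equivalently, an RTS above 1) means the system processes audio faster than it is spoken.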


FAIR, :







  • FAIR Intel Skylake c 18 ( 2 , - );
  • end-to-end C++ "" ;
  • wav2letter++;
  • , - ;








:







  • "" (RTS * ) . , ;
  • FAIR , .. , TDS-;
  • - ;
  • , , - state-of-the-art ( , );
  • FAIR , "" 40 , 0.26 RTF ( ). , , ;
  • 40 * 1 / 0.26 18 . , 1 1 - 8-9 ;
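The arithmetic in the last two bullets can be checked directly. This sketch assumes the figures mean 40 concurrent streams at an RTF of 0.26 on an 18-core CPU; the variable names are illustrative and that reading of the numbers is an assumption.

```python
# Figures quoted in the list above; their interpretation is an assumption.
CONCURRENT_STREAMS = 40  # assumed: number of parallel streams
RTF = 0.26               # real time factor quoted for that load
CPU_CORES = 18           # assumed: cores of the Intel Skylake CPU mentioned earlier

# Each stream runs at 1 / RTF times real time, so the whole machine
# processes STREAMS * (1 / RTF) seconds of audio per wall-clock second.
total_rts = CONCURRENT_STREAMS * (1.0 / RTF)
rts_per_core = total_rts / CPU_CORES

print(f"total RTS    = {total_rts:.1f}")
print(f"RTS per core = {rts_per_core:.1f}")
```

The per-core value lands between 8 and 9, which matches the "8-9" figure in the text.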


:







  • : 750 ( );
  • : 8-9 , 40 18 ;
  • : 500 1000 , 750 — ;
  • C++ -;




— ? , .. . . , - . — , , .









end-to-end , FAIR, , . , , end-2-end ( ).







FAIR, .. , C++ . , .







. , , . , " " "" ( ). ( ):







      FP32   FP32 + Fused   FP32 + INT8   FP32 Fused + INT8   Full INT8 + Fused   New Best
1     7.7    8.8            8.8           9.1                 11.0                22.6
5     11.8   13.6           13.6          15.6                17.5                29.8
10    12.8   14.6           14.6          16.7                18.0                29.3
25    12.9   14.9           14.9          17.9                18.7                29.8


, , . , CPU . . FAIR , , :







  • -;
  • , , , ;
  • FAIR , ;








, - .









- .







:







  • 1 — 7 , "" 4 — 8 — 16 ;
  • ;
  • -, . 8 latency 500 , , -, , 8 ;
  • , . . -;


GPU



SSD, 256+ GB NVME, 256+ GB
RAM 32 GB 32 GB
8+ 12+
3 GHz+ 3.5 GHz+
2
AVX2
GPU 1 1
8 "" 16 ""
, 280 320
95- , 430 476
99- , 520 592
1000 25.0 43.4
500 12.5 21.7
(1 / RTF) 85.6 145.0
10.7 12.1


3 GPU:







  • GPU Nvidia 1070 8+GB RAM ;
  • GPU Nvidia Quadro 8+GB RAM (TDP 100 — 150W) ;
  • Nvidia Tesla T4, , TDP 75W;


, GPU ( " "). , :







  • " (300 ) ". . TDP . TDP 75 150 , - 50-75% ;
  • " ". , 2 (+ );
  • " ". "" Tesla SLA Nvidia. SLA , .. Tesla 2-3 . "" — ;
  • " 2+ ". . Quadro T4 1 ;
  • " ". Tesla A100 US$12,500 . Quadro T4 ( ) ;
  • " ". "" — 3-4 , . — . , Nvidia 24/7 "" , . 100 Nvidia AMD 3 ;


— . — GPU 2-3 .







CPU



SSD, 256+ GB SSD, 256+ GB
RAM 32 GB 32 GB
8+ 12+
3.5 GHz+ 3.5 GHz+
2
AVX2
4 "" 8 ""
, 320 470
95- , 580 760
99- , 720 890
1000 11.1 15.9
500 5.6 8.0
(1 / RTF) 37.0 53.0
4.6 4.4
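The last row of the GPU and CPU tables appears to equal the (1 / RTF) row divided by the row reading "8+" / "12+". The sketch below checks that consistency; treating the "8+" / "12+" row as a core count and the last row as a per-core figure is an assumption, since the row labels are not available here.

```python
# (1 / RTF value, assumed core count, value in the table's last row)
table_rows = {
    "GPU table, column 1": (85.6, 8, 10.7),
    "GPU table, column 2": (145.0, 12, 12.1),
    "CPU table, column 1": (37.0, 8, 4.6),
    "CPU table, column 2": (53.0, 12, 4.4),
}

# The tables print one decimal place, so allow for rounding error.
TOLERANCE = 0.06

results = {}
for name, (rts, cores, reported) in table_rows.items():
    per_core = rts / cores
    results[name] = abs(per_core - reported) <= TOLERANCE
    print(f"{name}: {rts} / {cores} = {per_core:.2f} (table: {reported})")
```

Under that assumption all four columns agree with the table to within rounding, and the GPU setups come out at roughly 2.3 to 2.8 times the per-core speed of the CPU-only ones.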


C



  • GPU 10 — 15 RTS ( RTS GPU - 500 — 1,000). CPU 1 GPU ( ), , . - ;
  • CPU- 5 RTS, , latency throughput;
  • . — , - ;
  • 50 , 2 GPU (+ ) ;
  • GPU - 2-3 , CPU;




FAIR , 50% . — . 20-30 RTS , - 40-50% . :







  • GPU , ;
  • FAIR;
  • , GPU , ;
  • - , ;


? MIT.







