Senior AI and Deep Learning Architect – Model Compression and Quantization
Job description
- Research and develop state-of-the-art model compression techniques, including model distillation, pruning, quantization, model binarization, and others, for CNN, RNN, and LSTM models.
- Implement novel deep neural network architectures and develop advanced training algorithms to support model structure training, automatic pruning, and low-bit quantization.
- Apply and optimize model compression techniques for a variety of models in computer vision, audio, and other applications.
- Research and optimize model compression techniques for the Kneron CNN accelerator, and jointly optimize the hardware architecture for compressed models.
Requirements
- M.S./Ph.D. in Computer Science, Machine Learning, Mathematics, or a similar field (Ph.D. preferred).
- 3+ years of industry/academia experience with deep learning algorithm development and optimization.
- 3-5 years of software engineering experience in an academic or industrial setting.
- Holistic understanding of deep learning concepts, the state of the art in model compression research, and the mathematics of machine learning.
- Solid understanding of CNNs, RNNs, and LSTMs, as well as a variety of training methods, learning rate selection, and hyperparameter tuning.
- Research experience with at least one model compression technique, such as model distillation, pruning, quantization, or model binarization.
- Strong experience in C/C++ programming.
- Hands-on experience with computer vision and deep learning frameworks, e.g., OpenCV, TensorFlow, Keras, PyTorch, and Caffe.
- Experience in hardware architecture design is a plus.
- Ability to quickly adapt to new situations, learn new technologies, and collaborate and communicate effectively.
- Experience with parallel computing, GPU/CUDA, DSP, and OpenCL programming is a plus.
- A publication record at top-tier conferences, including but not limited to CVPR, ICCV, ECCV, NIPS, and ICML, is a strong plus.
Location
Taipei / Hsinchu / San Diego (USA) / Shenzhen / Zhuhai