# Prepare a GenAI model using AI Hub

Qualcomm AI Hub provides a streamlined workflow to prepare and deploy
large language models (LLMs) on Qualcomm Dragonwing™ products using
Qualcomm GenerativeAI Inference Extensions (Genie).

This approach enables efficient on-device execution of generative AI
models by leveraging the neural processing unit (NPU) and optimized binaries.

The following image shows the high-level GenAI model workflow from preparation to
execution.

![High-level GenAI model workflow from preparation to execution](../_images/genai-prepare-ai-hub.png)

Important

The steps in this section are validated for QCS9100, which uses the Hexagon v73 architecture. If you follow the link provided on AI Hub, use the export commands for Snapdragon X Elite, because it also supports the v73 architecture.

To create LLM binaries using AI Hub, see the following:

- [Prerequisites](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie#requirements)
- [Detailed instructions](https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie#step-2-export-qairt-compatible-llm-models-on-the-host-machine)

The following is an overview of LLM on-device deployment:

1. Prepare the model.

    1. Start with the desired LLM (for example, Llama 3.x series) from Hugging Face or another source.
    2. Use the `qai_hub_models` Python package to export the model. The exact export command is given in the detailed instructions linked above.

        The export process:

        1. Downloads the model weights.
        2. Uploads them to AI Hub for compilation.
        3. Generates QNN binaries split into multiple parts for NPU execution.
        4. Creates a deployable folder (`genie_bundle`) with all required assets (context binaries, configs, and tokenizer).
2. Compile and quantize the model.

    1. AI Hub compiles models into optimized binaries for [Qualcomm AI Runtime (QAIRT) SDK](https://docs.qualcomm.com/doc/80-63442-10/).
    2. AI Hub supports quantization (typically 4-bit internally, though weights may be stored as 8-bit for compatibility).
    3. Export scripts split large models into prompt processor and token generator components.
3. Deploy the model.

    The following are high-level steps of the deployment process. For detailed instructions and commands, see Run LLMs with Genie.

    1. Install the Qualcomm AI Runtime (QAIRT) SDK on the target device (Android, Windows, or Linux).
    2. Copy the compiled binaries and configuration files to the device.
    3. Use the [Qualcomm GenerativeAI Inference Extensions (Genie) CLI tools](https://docs.qualcomm.com/doc/80-63442-10/topic/tools_tools.html) (for example, `genie-t2t-run`) or the [Genie dialog API](https://docs.qualcomm.com/doc/80-63442-10/topic/api-rst_file_include_Genie_GenieDialog_h.html#file-include-Genie-GenieDialog.h) for inference.
    4. Ensure the target device meets the following requirements:

        - Hexagon architecture: v73 or newer
        - Required RAM:

            - 16 GB for 7B models
            - ~12 GB for 3B models
4. Run the model on-device using Genie APIs integrated with [Qualcomm AI Engine Direct](https://docs.qualcomm.com/doc/80-63442-10/topic/index_QNN.html).

    - Genie manages the multiple binaries and their execution order for optimal NPU utilization.
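
The device requirements listed under the deployment step can be expressed as a small pre-flight check. The following is a minimal sketch in Python and is not part of AI Hub or Genie: the function name `meets_requirements`, the `MIN_RAM_GB` table, and the way the Hexagon version and RAM size are obtained are assumptions made for illustration. Only the thresholds (v73 or newer, 16 GB for 7B models, roughly 12 GB for 3B models) come from this page.

```python
# Hypothetical pre-flight check; names are illustrative, not an AI Hub or Genie API.

MIN_HEXAGON_VERSION = 73  # "v73 or newer", from the requirements above

# Minimum RAM in GB keyed by model size, from the requirements above.
# The 3B figure is approximate ("~12 GB") in the source.
MIN_RAM_GB = {"7B": 16, "3B": 12}


def meets_requirements(hexagon_version: int, ram_gb: float, model_size: str) -> bool:
    """Return True if the device satisfies the documented minimums."""
    if model_size not in MIN_RAM_GB:
        raise ValueError(f"no documented RAM requirement for model size {model_size!r}")
    return hexagon_version >= MIN_HEXAGON_VERSION and ram_gb >= MIN_RAM_GB[model_size]


if __name__ == "__main__":
    # Illustrative values: a v73 device with 16 GB and with 12 GB of RAM.
    print(meets_requirements(73, 16, "7B"))  # True
    print(meets_requirements(73, 12, "7B"))  # False
```

Keeping the thresholds in one table makes it easy to adjust them if the documented requirements change for other model sizes.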

Important notes

- AI Hub advantages

    - Automatically handles model compilation, quantization, and splitting.
    - Provides pre-optimized models and bring your own model (BYOM) support.
- Genie

    - Simplifies inference by abstracting complex execution steps.
    - Offers APIs for text-to-text and dialogue-based interactions.
- Customization

    - The export flow defaults to 4-bit quantization for runtime efficiency.
    - There is no direct option to store weights as 4-bit; they remain 8-bit in storage but are loaded as 4-bit during execution.
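
To make the storage note concrete, the arithmetic below estimates the weight footprint at 8-bit and 4-bit precision. This is a back-of-the-envelope sketch: it counts weights only (no activations, KV cache, or runtime overhead), and the 7-billion parameter count is a nominal figure used for illustration, not a measurement of any specific model.

```python
def weight_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params_7b = 7e9  # nominal parameter count, illustrative only

stored = weight_size_gb(params_7b, 8)  # weights as stored (8-bit)
loaded = weight_size_gb(params_7b, 4)  # weights as used during execution (4-bit)

print(f"8-bit storage: {stored:.1f} GB")    # 8-bit storage: 7.0 GB
print(f"4-bit execution: {loaded:.1f} GB")  # 4-bit execution: 3.5 GB
```

Under these assumptions, the stored weights take about twice the space of their 4-bit form, which is worth budgeting for when copying the bundle to the device.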

Last Published: Apr 02, 2026
