In Google docs™, DocQA™ add-on is a good tool. Its functions include but are not limited to: providing document Q&A, extracting information from pdf documents or image documents, and answering your questions based on the information, document OCR, invoice OCR, invoice Q&A. It is powered by state-of-the-art DocQuery AI, Document AI. Document Query Engine Powered by Large Language Models, including: Microsoft LayoutLMv3 AI, Microsoft OCR-LayoutLMv3-Invoice AI, impira/layoutlm-document-qa AI(Microsoft LayoutLM Model fine-tuned version based on document data), impira/layoutlm-invoices AI(Microsoft LayoutLM Model fine-tuned version based on invoice data). With DocQA™, you can find answers to all your questions in invoices and income statements. Introduction to AI This is a fine-tuned version of the multi-modal LayoutLM model for the task of question answering on documents. It has been fine-tuned using both the SQuAD2.0 and DocVQA datasets. LayoutLM (Document Foundation Model) LayoutLM is a simple but effective multi-modal pre-training method of text, layout and image for visually-rich document understanding and information extraction tasks, such as form understanding and receipt understanding. LayoutLM archives the SOTA results on multiple datasets. For more details, please refer to our paper: LayoutLM: Pre-training of Text and Layout for Document Image Understanding Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, KDD 2020 LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou, ACL 2021 LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei, Preprint