
PO Scraper
I built a web application that automates purchase order (PO) processing by extracting key information such as the PO number, salesperson, line items, date, and company from PDF files with varying supplier formats. Using AI-powered data extraction, the app converts POs into structured JSON and streamlines entry into a client’s CRM, eliminating tedious manual work and significantly reducing errors.

Technologies Used
React
The frontend UI was built with React Native
Express (Node.js) & FastAPI
I used Express to build the main functionality of my backend. I used FastAPI to handle the raw text extraction.
Vector Database
I used Qdrant's Vector Database to store previously verified examples of various PO formats and then used them as context to get more accurate results from my PO Scraper.
OpenAI API
I used OpenAI's API to access their LLM and prompt it with the raw text, the context, and the information to be extracted.
Outside the code
This was the first time I was given a defined problem by a company to solve. I learned a lot about matching expectations with reality, especially in the age of AI.
Another thing I found to be extremely important was checking in with the client on a regular basis to show them progress and ensure what you are building will fulfill their needs.
With this particular project, bridging the gap between what I was building and what the current process was to record PO data was extremely important in ensuring a more efficient process.

What I learned
This was my first time really interacting with an LLM through code, so I learned a lot about how important it is to give the model as much important, but concise information as possible to get the most accurate results. In my case, that meant setting up a RAG pipeline which increased the accuracy of the system immensely
This was also the first time I had built a robust full stack application outside of school/personal projects. I learned how important it is to document everything as you go. It is often easy to throw things together in order to get it working, but in the long run it makes it much more difficult to make changes.