PDF-to-Rule Converter for Manufacturing Insights
in collaboration with HCL Tech
LLM/NLP pipeline that extracts and parses dimensional rules from technical PDFs into structured JSON.
- Fine-tuned an LLM/NLP classifier with class-weighted training to handle imbalanced rule/non-rule sentences across vendor PDF formats, improving classification accuracy by 48% over baseline.
- Built a LLaMA 3 prototype that parses extracted rules into structured JSON via constrained generation, achieving 78% label match on a 300-sample internal test set.
- Reduced downstream manual parsing workload, enabling an estimated 4× faster review workflow for engineering teams.
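
Illustrative sketch of the class-weighted training idea mentioned above. The project's exact weighting scheme is not stated here, so this assumes the common inverse-frequency ("balanced") heuristic, weight = N / (K * n_c). The function name, label names, and the 10/90 split are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return inverse-frequency class weights: N / (K * n_c).

    N is the number of samples, K the number of distinct classes,
    and n_c the number of samples in class c. Rarer classes get
    larger weights, so their errors count more in a weighted loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical imbalanced data: few "rule" sentences, many "non_rule".
labels = ["rule"] * 10 + ["non_rule"] * 90
weights = balanced_class_weights(labels)
print(weights)
```

Weights of this kind are typically passed to a weighted cross-entropy loss during fine-tuning, so that misclassifying the minority "rule" class is penalized more heavily than misclassifying the majority class.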