Blog

Metadata Driven Content Generation using GenAI

By Nitin Bhosekar Head Analytics, Artificial Intelligence, BI Practice & Executive Vice President

Posted on Sep 20, 2024

metadata-driven-content

Generative Artificial Intelligence (GenAI) has emerged as a transformative technology with the potential to revolutionize various industries. GenAI offers a powerful tool for streamlining processes, driving efficiency, and achieving scalability. Organizations can automate mundane tasks, improve decision-making, and enhance overall operational performance using GenAI.

GenAI has applicability across industries including Healthcare, Manufacturing, Insurance, Banking, Legal, and within Enterprise for all departments like IT, Finance, HR, Sales, and Operations. GenAI works best for use cases where content needs to be generated.

Metadata Driven Approach:

We usually use RAG pipelines to ingest historical documents and then generate new sections of text for specific use cases. In general, we implement solutions that generate a specific list of text sections or chunks. Any changes to the list of sections/chunks will require changes to code and testing.

In comes the metadata-driven approach that can be used to generate sections/chunks in a configurable manner with the addition of section definition and dependencies on other chunks with minimal code changes.

We first define chunk definition that includes its metadata including name, vector index, category, chunk dependency, and version among others. We can define as many chunk definitions as required even across departments and use cases. It is also possible to have different kinds of chunks in the Organization, some can be generic across enterprise while others will be specific to use cases or departments.

Once the solution is implemented and the request is received with a list of chunks to be generated, it first creates a Dynamic Chunk map where it arranges chunks into multiple levels based on dependencies. Once Dynamic Chunk map is ready it starts generating Level 1 chunks in parallel. After Level 1 Chunks is generated, second level chunks generation is initiated. It can go onto any level of chunk dependency, and we have tested up to 5-6 levels of chunks.

Once all the chunks are generated those can be used to populate template documents for target use cases like RFP responses and legal contracts among others. The same solution can be used to extend to new use cases by ingesting historical documents and defining Chunk definitions and dependencies.

System prompts used in the solution are externalized and prompts for new chunks will be defined in YAML files and will be part of Chunk definitions.

Below is a schematic representation of the complete flow including data ingestion and chunk generation.

schematic-representation

Flow includes ingestion of historical documents/data from the Enterprise Knowledge base, chunking and storing into multiple Vector databases based on configuration.

The user will input details from the Document Generation App that will send requests to the backend service generate a dynamic list of chunks and populate the target document template.

Business Benefits:

  1. Scalability and Cost Effectiveness:
    The solution offers scalability and cost-effectiveness by enabling its application across various use cases with a standardized implementation. This approach contributes to significant cost savings.

  2. Increased Efficiency and Productivity:
    The solution can significantly speed up the GenAI use case implementation by using a configurable metadata-driven approach for chunk generation.

  3. Improved quality and Consistency:
    The solution will help improve the quality of chunks generated and will have better consistency.

About Dilbagh Dhindsa

Innovation Head AI and Data Analytics

Dilbagh is a hands-on leader in Generative AI, AI/ML engineering, Data Science and software development. With over 20 years of International experience. He has developed groundbreaking AI and Generative AI solutions for global customers that helped solve complex business problems and optimize processes.

He had developed GenAI Accelerators for generating Sections of SoW(Statement of Work) using innovative metadata-driven dynamic chunk mapping. A US patent have been filed for the solution. Other GenAI Solutions included Secure Private GPT, an Email processor for license information, Recruitment tool for matching JD with resumes and chat.