Hello, I've been working on langchain-beam, a library that integrates LangChain with Apache Beam. It lets you use LangChain components, such as its LLM interface, inside Apache Beam ETL pipelines, so you can apply LLM capabilities to data processing and transformations and build RAG-based ETL pipelines.
Recently I added a feature that integrates embedding models into Beam pipelines, so the pipeline itself can generate vector embeddings for text. This makes embedding generation a step in the data pipeline rather than a separate service.
I’d love to hear your thoughts. Repo: https://github.com/Ganeshsivakumar/langchain-beam
Example usage, creating embeddings in a pipeline: https://github.com/Ganeshsivakumar/langchain-beam/blob/main/example/langchain-beam-example/src/main/java/com/langchainbeam/example/EmbeddingPipeline.java