WeClone: what it is, what problem it solves & why it's gaining traction

What it solves

WeClone provides an end-to-end pipeline to create a digital avatar of a person based on their actual chat history. It allows users to clone the speaking style and personality of an individual by fine-tuning a Large Language Model (LLM) on exported messaging data, effectively creating a bot that mimics a specific person's "flavor" of conversation.

How it works

The project implements a complete workflow:

Data Export & Preprocessing: It supports exporting chat records from platforms like Telegram (with support for images) and is building support for WhatsApp, Discord, and Slack. It uses Microsoft Presidio to filter out sensitive private information (phone numbers, emails, etc.) and allows for custom blocklists.
Fine-tuning: It uses the Qwen2.5-VL-7B-Instruct model by default, employing the LoRA (Low-Rank Adaptation) method for Supervised Fine-Tuning (SFT). It integrates with LLaMA Factory for model training.
Deployment: The fine-tuned model can be deployed as an API server or integrated into chatbot frameworks like AstrBot or LangBot to be used on platforms such as Discord, Telegram, and Slack.

Who it’s for

Individuals wanting to create a digital twin or a personalized AI assistant that speaks like them or a loved one.
Researchers experimenting with personality-driven LLM fine-tuning and multimodal (text and image) chat data.

Highlights

End-to-End Pipeline: Covers everything from data export and cleaning to training and deployment.
Multimodal Support: Supports fine-tuning with image data to better capture communication styles.
Privacy-Focused: Includes built-in PII (Personally Identifiable Information) filtering to protect sensitive data during training.
** uma**
Flexible Deployment: Compatible with various chatbot frameworks and messaging platforms via an OpenAI-compatible API server.

WeClone: what it is, what problem it solves & why it's gaining traction

WeClone: what it is, what problem it solves & why it's gaining traction

What it solves

How it works

Who it’s for

Highlights

Sources