FAQ
Installation & Environment Setup
How do I install Second Me on Windows / Linux / Mac / Docker?
Recommended solution: Use Docker (cross-platform support: Mac, Windows, Linux).
Notes for Windows users:
Additional installation of `make` is required (via MinGW or WSL). Using a native Windows environment is not recommended (not fully tested).
Non-Docker installation: Ensure all dependencies are installed (e.g., brew, poetry, Python 3.12).
Advanced users: Bare-metal deployment on Mac is suggested for maximum performance.
Can I shut down my computer during training?
Checkpoint resumption is supported: training progress is saved in the `resource/` and `data/` directories. Restart to continue training. Note: shutting down terminates the current training process; the service must be restarted.
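The checkpoint idea above can be sketched in miniature (illustrative only; the real training state format under `resource/` and `data/` differs):

```python
# Sketch: how checkpoint resumption works in principle.
# Progress is persisted to disk, so a restart picks up where training stopped.
import json
import os
import tempfile

def save_checkpoint(path: str, step: int) -> None:
    """Persist the current training step to disk."""
    with open(path, "w") as f:
        json.dump({"step": step}, f)

def load_checkpoint(path: str) -> int:
    """Return the saved step, or 0 if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["step"]
    return 0

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
save_checkpoint(ckpt, step=42)
print(load_checkpoint(ckpt))  # 42
```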
Does it support GPU acceleration?
Under development. Docker GPU support can be combined with local Ollama.
How to use proxy or resolve network issues during installation?
Select different sources based on your region/country for installation.
Model/Training
How do I train with a local model (e.g., Ollama, Gemma, Qwen)?
Docker users: Replace `127.0.0.1` in the API Endpoint with `host.docker.internal`.
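A sketch of that rewrite, assuming Ollama's default OpenAI-compatible endpoint on port 11434 (`resolve_endpoint` is an illustrative helper, not part of Second Me):

```python
# Sketch: inside a Docker container, 127.0.0.1 refers to the container
# itself, so a model server on the host must be reached via
# host.docker.internal instead.
def resolve_endpoint(endpoint: str, in_docker: bool) -> str:
    """Rewrite loopback addresses so a containerized app reaches the host."""
    if in_docker:
        return (endpoint
                .replace("127.0.0.1", "host.docker.internal")
                .replace("localhost", "host.docker.internal"))
    return endpoint

# Ollama's default port is 11434; /v1 is its OpenAI-compatible API prefix.
print(resolve_endpoint("http://127.0.0.1:11434/v1", in_docker=True))
# http://host.docker.internal:11434/v1
```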
Why does the model fail during training?
Common causes:
Insufficient Docker memory limits (increase memory allocation).
Incorrect model configuration (verify parameter compatibility).
What to do if ChromaDB reports embedding dimension mismatch?
Solutions:
Delete `data/chroma_db` and retrain. Ensure embedding model dimensions match (e.g., 768 vs. 3072).
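A sanity check for the dimension mismatch can be sketched like this (illustrative helper, not a project API):

```python
# Sketch: different embedding models emit different vector sizes
# (e.g. 768 vs. 3072); a store initialized with one size rejects the other.
def check_dims(expected_dim: int, vector: list[float]) -> None:
    """Raise before writing a vector whose size the store cannot accept."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"Embedding dimension mismatch: store expects {expected_dim}, "
            f"model produced {len(vector)}. Delete data/chroma_db and retrain."
        )

check_dims(768, [0.0] * 768)    # OK, no exception
# check_dims(768, [0.0] * 3072) # would raise ValueError
```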
Appendix:
Why is embedding failing with OpenAI error even when using Ollama?
The OpenAI SDK is used, so error paths may include "OpenAI," but requests are sent to the configured model endpoint, not OpenAI's service.
What is the recommended size for training data?
Keep the training data between 10k and 100k for stability. Larger datasets may cause timeouts or memory issues.
Can I reuse API calls to save money on retraining?
Yes, intermediate data is saved and won’t trigger repeated API calls.
Features & Architecture
What’s the difference between Second Me and me.bot?
Second Me: An open-source personal LLM framework.
Me.Bot: An online app built on this framework.
Can I run multiple Second Me instances?
Supported: Ensure sufficient hardware resources and resolve port conflicts.
Can I use Second Me in my own agent framework?
Open API and MCP service support for direct integration.
Why does embedding stage use a different model than chat stage?
Technical reason: not all model vendors provide both interfaces; OpenAI does, but DeepSeek, for example, does not (for now). Both a chat interface and an embedding interface are needed during training, so they are configured separately.
Errors & Debugging
No rule to make target 'setup' error?
Troubleshooting:
Confirm you’re in the project root directory.
Verify Makefile integrity.
"Too many open files" during training?
Possible cause: a memory leak. Please submit an issue if you encounter this situation.
When reporting issues, include:
OS (Mac/Linux).
Memory configuration (e.g., 16GB).
Docker version (if applicable).
Note: Avoid sharing private data in logs.
Can’t enter training page or web UI crashes?
Debug steps:
Run `make status` to check service status. Verify there are no network conflicts (e.g., port occupancy).
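Port occupancy can be checked with a short stdlib sketch (port 3000 below is only an example, not necessarily the port Second Me uses):

```python
# Sketch: probing whether a TCP port on localhost is already occupied,
# a common cause of web-UI startup failures.
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex((host, port)) == 0

print(port_in_use(3000))  # True if port 3000 is occupied on this machine
```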
What does "entities.parquet - no such file or directory" mean?
Cause: Insufficient data extraction model capability.
Suggestion: Switch to high-performance models (e.g., OpenAI API).
“Permission denied (publickey)” when cloning repository?
SSH key not set up. Clone over HTTPS instead.
Why "internal server error"?
Typical cause: a text chunk exceeds the embedding model's maximum input length, because the configured embedding model and the maximum length set by the project are inconsistent.
Action: adjust the `EMBEDDING_MAX_TEXT_LENGTH` parameter in the `.env` file according to the specific parameters of your model.
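How that limit interacts with chunking can be sketched as follows (the `EMBEDDING_MAX_TEXT_LENGTH` name matches the project's setting; the chunking helper and the 768 default are illustrative):

```python
# Sketch: clamping chunk size to the embedding model's input limit so no
# single request exceeds what the model accepts.
import os

MAX_LEN = int(os.environ.get("EMBEDDING_MAX_TEXT_LENGTH", "768"))  # illustrative default

def chunk(text: str, max_len: int = MAX_LEN) -> list[str]:
    """Split text into pieces no longer than the model's limit."""
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]

pieces = chunk("x" * 2000, max_len=768)
print([len(p) for p in pieces])  # [768, 768, 464]
```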
Step "generate_biography" failed?
For paid models (OpenAI/DeepSeek), errors typically surface as `openai.BadRequestError: Error code: ...`. Common codes:
400 - Bad Request
Reason: The request body format is incorrect.
Solution: Check whether the model name and API key are correct (there may be extra spaces after the model name).
401 - Unauthorized
Reason: Invalid API key, authentication failed.
Solution: Verify that your API key is correct. If you don’t have one, create an API key first.
402 - Insufficient Balance
Reason: Insufficient account balance.
Solution: Check your account balance and top up on the recharge page.
422 - Unprocessable Entity
Reason: Invalid parameters in the request body.
Solution: Adjust the parameters based on the error message.
429 - Too Many Requests
Reason: Request rate (TPM or RPM) limit reached.
Solution: Plan your request rate appropriately.
500 - Internal Server Error
Reason: Server internal error.
Solution: Retry later. If the issue persists, contact the server provider.
503 - Service Unavailable
Reason: Server is overloaded.
Solution: Retry your request later.
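As a quick reference, the status codes above can be bucketed into actions with a small helper (an illustrative mapping, not part of the project):

```python
# Sketch: classifying the API status codes listed above into
# "fix config", "top up", and "back off and retry" buckets.
RETRYABLE = {429, 500, 503}   # rate limits and server-side failures
CONFIG = {400, 401, 422}      # malformed requests and auth problems
BILLING = {402}               # insufficient balance

def action_for(status: int) -> str:
    """Map an HTTP status code to the recommended FAQ action."""
    if status in RETRYABLE:
        return "retry later"
    if status in CONFIG:
        return "check model name, API key, and request parameters"
    if status in BILLING:
        return "top up account balance"
    return "consult provider documentation"

print(action_for(429))  # retry later
```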
For errors like `Biography generation failed: must be......, not.......` or `Expecting value......`:
Upgrade the model: select a more capable model to ensure that generation quality meets the demand.
Switch to an API model: switch to a cloud-based API service that supports the OpenAI protocol to circumvent local compute or compatibility limitations.
For errors like `Biography generation failed: Request timed out`, the cause is usually insufficient local computing resources, leading to model response timeouts. The following optimization is recommended:
Use cloud API services: use APIs that support the OpenAI protocol to call the model, avoiding local hardware performance limitations and ensuring stable generation.
Issue with Embedding Model?
Embedding failures are rare; when one occurs, it can be solved as follows:
Use a better model (e.g., OpenAI) or host a local high-performance extractor.
`sqlite3.OperationalError: no such column: collections.topic`?
Delete the data directory where ChromaDB stores its data.
Restart the application to reinitialize ChromaDB (run `make restart` or `make docker-restart-all`, depending on your platform).
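The reset described above can be scripted; a minimal sketch (the `data/chroma_db` path follows the directory mentioned earlier in this FAQ; `reset_chroma` is an illustrative helper):

```python
# Sketch: deleting ChromaDB's on-disk state so it is rebuilt from scratch
# on the next application start.
import os
import shutil

def reset_chroma(path: str = "data/chroma_db") -> bool:
    """Remove the store directory; return True if anything was deleted."""
    if os.path.isdir(path):
        shutil.rmtree(path)
        return True
    return False
```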
Training stuck at "Training to create Second Me -> train"?
Resource suggestion: the training process is memory-intensive; allocating more memory can speed up training. 16 GB or more is recommended.
Other Questions
Can I use Logseq, Notion, me.bot logs for training?
Yes, convert to plaintext/markdown before uploading.
Why are some of my memory files missing after upload?
Current UI displays only 100 files; pagination is under development.
Does Mindverse recruit interns or collaborators?
Yes! Contact Scarlett or Kevin for opportunities.