Summary

Mushie is a homemade mini humanoid robot that my girlfriend and I built together.

Contributions

My girlfriend focused on the design, sculpting, silicone casting, painting, and animation, while I handled the software, hardware, and electronics.

Duration of the Project

The project took around six months to complete.

About Mushie


Mechanical Design

Mushie's mechanical parts were designed from scratch using Fusion 360. Some aesthetic elements were done in Blender as well as Plasticity. There were two major considerations for Mushie's mechanical design:
  • Compact size - the head is approximately 10 cm x 10 cm.
  • Ease of assembly.
The design went through many iterations to satisfy these criteria. Some novel mechanisms, in particular the mouth and brow mechanisms, were invented and implemented along the way.

Electronics, Actuator and Motor Control

Mushie has ten BLDC motors, each powered by a custom board with an ESP32 microcontroller and SimpleFOC Mini boards. The SimpleFOC Mini boards utilise DRV8313 drivers. Each actuator is closed-loop controlled via an AS5048A encoder, and the gearing varies between actuators to account for differences in torque requirements.
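As a rough illustration of how an absolute encoder and a gear reduction fit together, the sketch below accumulates wrapped 14-bit readings (the AS5048A reports 14 bits per revolution) into a continuous angle and scales it by a gear ratio. It assumes the encoder sits on the motor shaft ahead of the gearing; the class name and the ratio used in the usage note are hypothetical and not taken from Mushie's firmware.

```python
COUNTS_PER_REV = 1 << 14  # AS5048A: 14-bit absolute angle, 16384 counts per turn


class MultiTurnTracker:
    """Accumulates wrapped encoder counts into a continuous angle (hypothetical helper)."""

    def __init__(self, gear_ratio: float):
        self.gear_ratio = gear_ratio  # assumed motor-to-output reduction
        self.last_raw = None
        self.total_counts = 0

    def update(self, raw: int) -> float:
        """Feed one raw reading; return the output-shaft angle in degrees."""
        if self.last_raw is not None:
            delta = raw - self.last_raw
            # Take the shortest path across the 0 / 16383 wrap-around.
            if delta > COUNTS_PER_REV // 2:
                delta -= COUNTS_PER_REV
            elif delta < -(COUNTS_PER_REV // 2):
                delta += COUNTS_PER_REV
            self.total_counts += delta
        self.last_raw = raw
        motor_deg = 360.0 * self.total_counts / COUNTS_PER_REV
        return motor_deg / self.gear_ratio
```

For example, with `MultiTurnTracker(gear_ratio=4.0)`, a reading that jumps from 16000 to 100 is treated as a small forward step across the wrap-around rather than almost a full turn backwards.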
Actuator Control Algorithm
The control algorithm is a custom closed-loop controller based on PID control, with additional modifiers to account for sensor noise.
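The exact modifiers are not described here, so the following is only a minimal sketch of the general idea: a PID position loop where the measurement is smoothed with an exponential moving average and small errors are ignored through a deadband. Both noise-handling choices, along with the class name, gains, and limits, are assumptions for illustration rather than Mushie's actual controller.

```python
class FilteredPID:
    """PID controller with a low-pass filter and a deadband.

    The filter and deadband are illustrative noise-handling choices,
    not the modifiers actually used on Mushie.
    """

    def __init__(self, kp, ki, kd, alpha=0.2, deadband=0.0, out_limit=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.alpha = alpha          # smoothing factor, 0 < alpha <= 1
        self.deadband = deadband    # errors smaller than this count as zero
        self.out_limit = out_limit  # symmetric clamp on the output
        self.filtered = None
        self.integral = 0.0
        self.prev_error = None

    def step(self, target, measured, dt):
        # Exponential moving average to smooth the noisy sensor reading.
        if self.filtered is None:
            self.filtered = measured
        else:
            self.filtered += self.alpha * (measured - self.filtered)

        error = target - self.filtered
        if abs(error) < self.deadband:
            error = 0.0  # avoid chasing noise around the setpoint

        self.integral += error * dt
        # No derivative term on the very first step (no previous error yet).
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error

        out = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-self.out_limit, min(self.out_limit, out))
```

A lower `alpha` filters more noise but adds lag, and a wider `deadband` reduces jitter at rest at the cost of positioning accuracy, so both would need tuning per actuator.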

Deep-Learning Based Lipsyncing Algorithm

Mushie's lipsyncing algorithm is a deep-learning model based on the transformer architecture, trained from scratch.
Specifications
  • Low-latency inference: approximately 10 ms for several seconds of audio.
  • Predicts lip-corners (oo and ee vowels) and jaw movements.
  • 141 experimental training runs, with many algorithm modifications.
A visual evaluation and comparison with a commercially available solution can be found here: Lipsyncing Evaluation

Mushie's Voice and Singing

Speaking Voice with GPT-SoVITS
GPT-SoVITS is an open-source project for high-quality text-to-speech. We created Mushie's voice by sampling my girlfriend's voice, modifying the samples, and then using them to train a custom GPT-SoVITS TTS model.
Singing with RVC (Retrieval-based Voice Conversion)
RVC is an open-source project developed for artistic voice cloning. Mushie's singing is performed by us and then converted to Mushie's voice through RVC.

Mushie's Software

Mushie's software is written in Python. Communication with the ESP32 controllers is done via PyUSB and a custom-designed protocol which talks to SimpleFOC.
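The protocol itself is not documented here, so the sketch below only illustrates what a small host-side command frame could look like. The frame layout (start byte, motor ID, 32-bit float target, XOR checksum) and all names are hypothetical. In a real setup the encoded bytes would then be handed to the USB layer, for instance PyUSB's device `write` call.

```python
import struct

START_BYTE = 0xA5  # hypothetical frame marker
FRAME_LEN = 7      # start + motor id + float32 target + checksum


def xor_checksum(payload: bytes) -> int:
    """XOR of all bytes; a simple integrity check for short frames."""
    checksum = 0
    for b in payload:
        checksum ^= b
    return checksum


def encode_target(motor_id: int, angle_rad: float) -> bytes:
    """Pack a target-angle command for one motor into a 7-byte frame."""
    body = struct.pack("<BBf", START_BYTE, motor_id, angle_rad)
    return body + bytes([xor_checksum(body)])


def decode_target(frame: bytes):
    """Validate a frame and return (motor_id, angle_rad)."""
    if len(frame) != FRAME_LEN or frame[0] != START_BYTE:
        raise ValueError("malformed frame")
    if xor_checksum(frame[:-1]) != frame[-1]:
        raise ValueError("checksum mismatch")
    _, motor_id, angle_rad = struct.unpack("<BBf", frame[:-1])
    return motor_id, angle_rad
```

A fixed-length binary frame keeps parsing on the microcontroller trivial; a text-based command format would be easier to debug by hand but slower to parse.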

The software also has a simple web interface which allows for debugging and tuning range parameters for the motors.

Since Mushie does not yet have an auto-calibration system, the motor ranges are tuned manually.
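To show what manually tuned ranges might look like in code, here is a minimal sketch that maps a normalised pose value between 0 and 1 onto a per-motor angle range. The `MotorRange` class, its fields, and the example limits are hypothetical and only stand in for whatever parameters the web interface exposes.

```python
from dataclasses import dataclass


@dataclass
class MotorRange:
    """Manually tuned travel limits for one motor (hypothetical structure)."""

    min_angle: float        # angle at pose value 0.0, in radians
    max_angle: float        # angle at pose value 1.0, in radians
    inverted: bool = False  # flip direction for mirrored actuators

    def to_angle(self, value: float) -> float:
        """Map a normalised value in [0, 1] to an angle inside the range."""
        value = max(0.0, min(1.0, value))  # never command beyond the limits
        if self.inverted:
            value = 1.0 - value
        return self.min_angle + value * (self.max_angle - self.min_angle)
```

For example, `MotorRange(-0.4, 0.8).to_angle(0.5)` lands at the midpoint of the range, and out-of-range inputs are clamped so animation data cannot push a motor past its tuned limits.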

Future Improvements

Better Power Electronics
Currently, Mushie is powered by two separate USB-C cables, but the entire robot could be driven by a single 100 W USB-C PD cable.
More Compact Actuators
The current actuators have separate sensor and controller boards. In the future, I would like to design a new custom board that utilises CAN bus and has an integrated magnetic sensor on the back side.

Cat Robot - Work in Progress

After Mushie, we began developing a new social robot with a cat-like appearance.

The cat robot will have stretchy fabric skin instead of silicone.