# SillyBubble Image Generator - Design Document

## Component Architecture

The SillyBubble image generator creates comic-style chat visualizations with a modular component system. The final image has a 5:1 width-to-height ratio, composed of layered elements.

### Core Components

1. **Background Layer**
   - Full 5:1 ratio base image for setting style/mood
   - Can be themed (e.g., fantasy, sci-fi, cozy, etc.)
   - Provides consistent canvas for all other elements

2. **Character Layer (Chibi)**
   - Positioned on the left side of the image
   - Various character images with different expressions/poses
   - Takes approximately 20% of the total width
   - Named by character (e.g., "bianca.png", "ruby.png")

3. **Bubble Layer**
   - Semi-transparent speech bubble
   - Positioned to the right of the character
   - Includes a pointer/tail connecting to the character
   - Takes approximately 70% of the total width
   - Various styles (rounded, square, cloud, thought, etc.)

4. **Text Layer**
   - Dynamically rendered text content
   - Positioned within the bubble boundaries
   - Supports word-wrapping and styling
   - Font options compatible with the overall theme

## Implementation Approach

### File Structure and Naming Convention
```
/characters/              - Main directory for all characters
  /Example/               - Example character (fallback)
    background.png        - Background image
    character.png         - Character image
    speech.png            - Speech bubble
    thought.png           - Thought bubble
  /Bianca/                - Another character
    background.png        - Background image
    character.png         - Character image
    speech.png            - Speech bubble
    thought.png           - Thought bubble
/fonts/*.ttf              - Font files
```

Each character has their own directory containing all assets. If a specific character's asset is missing, the system will fall back to the Example character's corresponding asset.

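
The per-asset fallback described above can be sketched as follows. This is a minimal illustration in Python of the lookup logic only, not the code in `image.php`; the function name `resolve_asset` and the `root` argument are hypothetical, and the sketch assumes the directory name matches the requested character name exactly (how letter case is handled is not specified here).

```python
from pathlib import Path
from typing import Optional

FALLBACK_CHARACTER = "Example"  # the fallback character named in this document


def resolve_asset(root: Path, character: str, asset: str) -> Optional[Path]:
    """Return the character's own asset if it exists, else the Example copy.

    `root` is the directory that contains `characters/` (an assumption of
    this sketch). Returns None when neither directory has the asset.
    """
    for name in (character, FALLBACK_CHARACTER):
        candidate = root / "characters" / name / asset
        if candidate.is_file():
            return candidate
    return None
```

Because the lookup runs per asset, a character directory that only contains `character.png` would still get its background and bubbles from the Example character.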
### Image Dimensions
- Total Image: 2000×400px (5:1 ratio)
- Background: Full 2000×400px canvas
- Character: ~400×400px (20% of width)
- Bubble: ~1400×300px (70% of width)
- Text Area: ~1300×250px (inside bubble)
- Remaining 10% (200px width) for margins and spacing

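
To make the arithmetic concrete, the sketch below derives pixel rectangles from the proportions listed above. It is written in Python purely as an illustration. The sizes come from this document, but the placement is an assumption: the character is flush with the left edge, the spare 200px is split evenly into a gap before the bubble and a right margin, and the bubble and text area are centered. The names `Rect` and `compute_layout` are hypothetical.

```python
from typing import Dict, NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def compute_layout(canvas_w: int = 2000, canvas_h: int = 400) -> Dict[str, Rect]:
    """Derive layer rectangles from the documented proportions."""
    character_w = canvas_w * 20 // 100      # 20% of the width
    bubble_w = canvas_w * 70 // 100         # 70% of the width
    bubble_h = canvas_h * 75 // 100         # 300px on a 400px canvas
    text_w = canvas_w * 65 // 100           # 1300px on a 2000px canvas
    text_h = canvas_h * 625 // 1000         # 250px on a 400px canvas

    # Assumption: split the spare width evenly between gap and right margin.
    spare = canvas_w - character_w - bubble_w
    bubble_x = character_w + spare // 2
    bubble_y = (canvas_h - bubble_h) // 2   # vertically centered

    # Center the text area inside the bubble.
    text_x = bubble_x + (bubble_w - text_w) // 2
    text_y = bubble_y + (bubble_h - text_h) // 2

    return {
        "background": Rect(0, 0, canvas_w, canvas_h),
        "character": Rect(0, 0, character_w, canvas_h),
        "bubble": Rect(bubble_x, bubble_y, bubble_w, bubble_h),
        "text": Rect(text_x, text_y, text_w, text_h),
    }
```

With the default 2000×400 canvas this places the bubble at x=500, y=50 and the text area at x=550, y=75, leaving 50px of horizontal and 25px of vertical padding inside the bubble.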
### Parameter System
The enhanced `image.php` will accept:
- `q`: Text content (required)
- `character`: Character to use (e.g., "bianca"); defaults to "Example" if not provided
- `bubble_type`: "speech" or "thought" (defaults to "speech")
- `style`: Legacy parameter; can be used instead of the `character` parameter

**Backward Compatibility**:
- If only `style` is provided (no `character`), the script will use the style value as the character name
- This ensures the SillyTavern extension doesn't need modification

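
The parameter rules above, including the `style` fallback, could be resolved along these lines. This is a Python sketch of the decision logic rather than the PHP in `image.php`. Two behaviors are assumptions not stated in this document: `character` wins when both `character` and `style` are present, and an unrecognized `bubble_type` falls back to "speech". The names `BubbleRequest` and `resolve_params` are hypothetical.

```python
from typing import Mapping, NamedTuple


class BubbleRequest(NamedTuple):
    text: str
    character: str
    bubble_type: str


def resolve_params(query: Mapping[str, str]) -> BubbleRequest:
    """Turn raw query parameters into a normalized request."""
    text = query.get("q", "")
    if not text:
        raise ValueError("missing required parameter: q")

    # `character` first, then legacy `style`, then the documented default.
    character = query.get("character") or query.get("style") or "Example"

    bubble_type = query.get("bubble_type", "speech")
    if bubble_type not in ("speech", "thought"):
        bubble_type = "speech"  # assumption: unknown values use the default

    return BubbleRequest(text, character, bubble_type)
```

A legacy request such as `{"q": "Hi", "style": "bianca"}` resolves to the character "bianca" with a speech bubble, which is what lets existing callers keep working unchanged.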
### Image Composition Process
1. Load or create background layer (full canvas)
2. Check if character exists and overlay on left side
3. Position and overlay appropriate bubble template
4. Calculate text boundaries within bubble
5. Render text with proper wrapping and styling
6. Output final composed image

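
Steps 4 and 5 amount to fitting the text into the text area. The Python sketch below shows one simple way to do that with the standard library's `textwrap` module. It assumes a fixed average glyph width and line height (`glyph_w` and `line_h` are hypothetical values), whereas a real renderer would measure the chosen font. The function name `wrap_text` is also hypothetical.

```python
import textwrap
from typing import List


def wrap_text(text: str, area_w: int = 1300, area_h: int = 250,
              glyph_w: int = 20, line_h: int = 50) -> List[str]:
    """Wrap text to the lines that fit inside the text area.

    Characters per line and the number of lines are estimated from the
    assumed glyph width and line height. Text that does not fit is cut
    off and the last line ends with "...".
    """
    max_chars = max(1, area_w // glyph_w)
    max_lines = max(1, area_h // line_h)
    return textwrap.wrap(text, width=max_chars,
                         max_lines=max_lines, placeholder="...")
```

Truncating is only one option for overflow. Shrinking the font size until the text fits would be an alternative, at the cost of a measure-and-retry loop.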
### Dynamic Bubble Generation
- If no bubble template exists, dynamically draw a bubble
- Support both template-based and on-the-fly bubble generation
- Ensure proper connection between character and bubble

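
For an on-the-fly bubble, the connection to the character can be drawn as a small triangular tail. The Python sketch below computes only the three polygon points; drawing them is left to whatever graphics library is used. The geometry is an assumption of this sketch: the tail's base sits on the bubble's left edge, centered vertically, and its tip touches the character's right edge. The name `bubble_tail` and the `tail_half_height` default are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def bubble_tail(bubble_x: int, bubble_y: int, bubble_h: int,
                character_right: int, tail_half_height: int = 30) -> List[Point]:
    """Return the triangle (tip, upper base, lower base) for a speech tail."""
    mid_y = bubble_y + bubble_h // 2
    tip = (character_right, mid_y)                 # points at the character
    upper = (bubble_x, mid_y - tail_half_height)   # on the bubble's left edge
    lower = (bubble_x, mid_y + tail_half_height)
    return [tip, upper, lower]
```

For a bubble at x=500, y=50 with height 300 and a character ending at x=400, the tail spans the 100px gap with its tip at (400, 200).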
## Visual Representation

```
FINAL COMPOSITION (5:1 ratio):
+--------------------------------------------------------------------------------------+
|                                                                                      |
|  +--------+    +--------------------------------------------------------------+      |
|  |        |    |                                                              |      |
|  | CHIBI  |<---+                         TEXT CONTENT                         |      |
|  |        |    |                                                              |      |
|  +--------+    +--------------------------------------------------------------+      |
|                                                                                      |
+--------------------------------------------------------------------------------------+
```

## Feature Roadmap

1. **Basic Implementation**
   - Support for character-based styling
   - Simple bubble positioning
   - Proper text wrapping

2. **Enhanced Features**
   - Multiple character positions (left/right)
   - Various bubble styles
   - Expression selection for characters

3. **Advanced Features**
   - Multiple characters in one image
   - Animated GIF output option
   - Theme-based text styling