Intro to Project Meilin and the LINNE platform

Virtual singer & LINNE

CC-BY-NC

Slide author

(Chou Shouichi) / MGdesigner

Paul Liu and I organize the project

Wikimedia.tw: member of the board of directors (also directing tech development)

A programmer

A musician (Jazz ukulele, DTM)

Everyone knows her

Powered by

Yamaha Vocaloid2 engine

So

Why a FOSS 'v'ocaloid?

If you buy an instrument

You can play any song, do anything.

Play

With your teeth

Break it!

Burn it

In any Vocaloid product EULA

You don't get the full rights

No anti-society works (so, which works are anti-society?)

Trademark protection (images, keywords)
(ex: 'Vocaloid' ,'','''s image)

Not using Miku images = not popular

Musicians are controlled

No freedom

Be ruled

Using a Gibson guitar, you are its master.

Using Vocaloid products, you are their slave.

INDIE DIE

UTAU

A free Vocaloid-like program

DIY a Vocaloid

Programs: editor (frontend) + resampler + wavtool

Data: vocal DB = oto.ini + wav samples

The vocal DB is an open spec, so many people make their own
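
To give a feel for what one vocal DB entry holds, here is a minimal sketch that parses a single oto.ini line. It assumes the layout commonly used by UTAU voicebanks, `file.wav=alias,offset,consonant,cutoff,preutterance,overlap` with times in milliseconds. The field names and the sample line are illustrative assumptions, not taken from the Meilin DB.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    """One oto.ini entry (field names are illustrative assumptions)."""
    wav: str             # wav file holding the recorded sample
    alias: str           # name the editor uses to pick this sample
    offset: float        # ms skipped at the start of the wav
    consonant: float     # ms of the fixed (non-stretched) region
    cutoff: float        # ms marking the end of the usable region
    preutterance: float  # ms placed before the note start
    overlap: float       # ms crossfaded with the previous note


def parse_oto_line(line: str) -> OtoEntry:
    """Parse 'file.wav=alias,offset,consonant,cutoff,preutterance,overlap'."""
    wav, sep, rest = line.strip().partition("=")
    fields = rest.split(",")
    if not sep or len(fields) != 6:
        raise ValueError(f"unexpected oto.ini line: {line!r}")
    # Empty numeric fields are treated as 0 in this sketch.
    numbers = [float(x) if x.strip() else 0.0 for x in fields[1:]]
    return OtoEntry(wav, fields[0], *numbers)


# Hypothetical sample line, for illustration only.
entry = parse_oto_line("ka.wav=ka,50,120,-300,80,20")
print(entry.alias, entry.preutterance)
```

Because the format is this simple, anyone with a recorder and a text editor can build a DB, which is why so many DIY voicebanks exist.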

Workflow of 'v'ocaloid programs

Editor: compose the melody (many notes)

Resampler: modulate a sample to the specified pitch and other parameters (velocity...)

Wavtool: combine these modulated wavs

Finally, we get a vocal wav file and mix it into a song
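
The three stages above can be sketched in a few lines. This is a toy model under loud assumptions: the "resampler" here is naive linear-interpolation resampling (it changes duration along with pitch, which a real resampler avoids), the "wavtool" is a plain linear crossfade, and all function names are hypothetical rather than any real UTAU program's API.

```python
import math

SAMPLE_RATE = 48000  # Hz, matching the 48000 Hz source recordings


def make_sample(freq_hz, seconds=0.1):
    """Stand-in for a recorded voice sample: a plain sine wave."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def resample(sample, semitones):
    """Toy resampler: shift pitch by reading the sample faster or slower."""
    ratio = 2 ** (semitones / 12)
    out = []
    for i in range(int(len(sample) / ratio)):
        pos = i * ratio
        lo = int(pos)
        hi = min(lo + 1, len(sample) - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring source samples.
        out.append(sample[lo] * (1 - frac) + sample[hi] * frac)
    return out


def wavtool(parts, overlap=64):
    """Toy wavtool: join the parts with a linear crossfade of `overlap` samples."""
    song = list(parts[0])
    for part in parts[1:]:
        n = min(overlap, len(song), len(part))
        for i in range(n):
            w = i / n
            song[-n + i] = song[-n + i] * (1 - w) + part[i] * w
        song.extend(part[n:])
    return song


# "Editor" output: a melody as semitone offsets from the recorded sample.
melody = [0, 4, 7]
base = make_sample(220.0)
parts = [resample(base, s) for s in melody]
song = wavtool(parts)
print(len(song), "samples in the vocal track")
```

The point of the split is that each stage is a separate program, so a better resampler can be dropped in without touching the editor or the wavtool.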

But

Free of charge, but not free as in freedom

The default resampler works badly

The DB has poor international support (Shift-JIS)

oto.ini does not implement ini comments (;)

UTAU always auto-sorts oto.ini (which makes collaboration hard)

The UI is hard to control

Not open source

Its development is very private

And you know... Yamaha owns many superpatents

A nice free Vocaloid is very difficult

During 2011-2012
One day, Paul Liu talked to me

A new algorithm, 'World', better than Vocaloid2

Author: Dr. Morise

Patent free

EFB-GW (synthesizer) for UTAU

Open source (old versions GPL, newer ones BSD)

https://github.com/mmorise/World

In December, Dr. Morise will do another great upgrade

How good is the World algorithm?

A very awesome 'autotune'

(the original official test is a real-time karaoke autotune)

Modulates a sample to any pitch without distortion (keeps F0 well)

(Vocaloid2 can't, so Miku needs 3 different range versions of each sample)

Very fast, no need to pre-prepare frequency tables

(just do it in real time)

On x86, it even works well on older machines (maybe on ARM too)
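
The arithmetic behind "modulate a sample to any pitch" is small enough to show. The sketch below assumes the engine has already analyzed a voice into an F0 contour (one value per frame, with 0 marking unvoiced frames) and only scales that contour. It does not call the World library, and `shift_f0` is a hypothetical helper name.

```python
def shift_f0(f0_contour, semitones):
    """Scale every voiced F0 value by 2**(semitones / 12).

    Unvoiced frames (assumed to be marked with 0) are left untouched.
    """
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio if f > 0 else 0.0 for f in f0_contour]


# Hypothetical contour in Hz: unvoiced, two voiced frames, unvoiced.
contour = [0.0, 220.0, 221.5, 0.0]
up_octave = shift_f0(contour, 12)
print(up_octave)
```

Since only the F0 track changes while the rest of the analysis stays as recorded, one recording per sample can cover a wide range, instead of separate high, middle, and low recordings.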

OK, let's do it!

Finally

we made her...

Listen...

Hear MAMA cover

: ancient Chinese and Japanese pentatonic scale note (Do Re Mi Sol La)

It also means we 'recruit' a voice actor (who is also a jazz singer) from the Internet

Merlin (super wizard), Linux

http://projectmeilin.github.io/

Project Meilin Features

CC-BY

UTAU compatible

Professional recording (in a studio)

Source: 24-bit, 48000 Hz wavs

VCV, VC

(V = vowel, C = consonant)

Recorded: Japanese, Mandarin (Taiwan style)

How good? A test

Commercial Miku VS. open content Meilin

V2 Miku: each sample recorded in high, middle, and low versions

VS.

Meilin: each sample recorded in just 1 version.

Listen to the comparison video (starting from 0:44)

Especially check whether the very low and very high pitches are distorted.

Facts

Miku DB: 1 GB+

Only Japanese

Meilin DB: 627 MB

Japanese + Mandarin

The Mandarin DB is 3 times the JP DB

Thanks to Dr. Morise

Without his effort and kindness, a good FOSS virtual singer would be impossible

2 more special features

1: 14 special effects

Defined in oto.ini

3 breaths: br1, br2, br3 (ex: Miku only has these breaths)

Spanish 'R' rolling: trill

Cough: cough

Cry, dry tears: drytears

Blow nose: blownose

Sucking: suck

Sigh: sgn1, sgn2, sgn3, sgn4

Whistle: whsl

Clear throat: clnt

2: possible

EX:'' in (video)

In Mandarin, there is the same 'u'

Just borrow what we recorded.

We can also borrow other Mandarin samples for synthesizing some foreign languages.

(ex: 1 or 2 foreign lyrics in a Japanese song)

A 'v'ocaloid can also do speech synthesis

Better than traditional speech synthesis

Accent (= pitch, velocity, rhythm, speed) is controllable

Can express many emotions (melody lines): crying, anger...

TTS, storytelling, emotional '' possible

Some tests I have done with Miku: 1, 2, 3, 4, based on my scale algorithm. 'Auto render' is possible, but...

If you use Vocaloid to do this, you need to beg Yamaha to open its API. But our software stack is open source. She could do more than singing.

How was she made?

Recorded in a pro studio

Thanks to our sponsor (Aguai), my master

(a famous pop song producer in Taiwan)

About the vocal

Her name is (Lo Chu).

We chose her voice from 20 girls on the Internet.

She is a singer in a jazz / anime cover song band.

Also trained in voice acting.

Her Japanese accent is not bad.

Japanese friend ATsushi

But it was very hard work

Japanese recording needs 3~4 hours.

But complete Mandarin (all mathematically possible combinations, minus the samples that phonology shows are repeated)

Mandarin recording needs days.

The final day

LINNE platform

We defined the FOSS 'v'ocaloid stack

Of course open source

Compatible with the UTAU DB (but UTF-8)

resampler + wavtool + editor (interface) + DB-making tools

May include 'hardware'

Hardware ex: doll robot

Our oto.ini DB spec

You can use ';' for comments

Editor programs shouldn't re-sort the file

UTF-8

IPA based (International Phonetic Alphabet)

With IPA, different languages can share common pronunciation samples

(no more re-recording, a smaller DB, better storage efficiency)
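
A minimal sketch of what these rules ask of a tool: keep ';' comment lines, keep entries in file order, and treat the text as UTF-8 so IPA aliases survive. The function names and the sample entries are hypothetical, and only whole-line comments are handled here. The actual LINNE tools may do this differently.

```python
# Hypothetical DB text. A real file would be read with
# open(path, encoding="utf-8") rather than embedded like this.
SAMPLE = (
    "; shared vowel, usable from Japanese and Mandarin\n"
    "u.wav=u,10,60,-200,40,15\n"
    "; IPA alias with a non-ASCII symbol\n"
    "ʂa.wav=ʂa,12,80,-220,50,20\n"
    "ka.wav=ka,12,80,-220,50,20\n"
)


def load_oto(text):
    """Parse oto.ini text into items, keeping comments and file order."""
    items = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            items.append(("comment", line))  # kept verbatim
        else:
            wav, _, params = line.partition("=")
            items.append(("entry", wav, params))
    return items


def dump_oto(items):
    """Write items back in their original order, without sorting."""
    lines = []
    for item in items:
        if item[0] == "comment":
            lines.append(item[1])
        else:
            lines.append(f"{item[1]}={item[2]}")
    return "\n".join(lines) + "\n"


items = load_oto(SAMPLE)
entry_names = [item[1] for item in items if item[0] == "entry"]
print(entry_names)
```

A load followed by a dump returns the file unchanged, so several people can edit the same oto.ini under version control without spurious diffs.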

Engine (now xvsqExec, may need to be changed)

Jcadencii

Linne-editor (in dev) (song editor, front end)

Wavtool-pl (GPL wavtool)

tn_fnds_yc (GPL) (resampler, EFB-GW variant)

World lib

Other programs in the future (ex: linne-TTS)

The chart may need to evolve.

Problem now: the editor (frontend)

Cadencii is written in .NET and binds too many Windows native calls

Jcadencii (the Java port of Cadencii) is very slow

Upstream development stopped. We gave it up too.

Another open UTAU frontend: http://fluidvocalsynth.weebly.com/ (also .NET)

linne-editor(frontend)

https://github.com/marty1885/linne-editor

In very early development

Facts

We don't have enough manpower for interface coding

Normal users still need Wine + UTAU to edit

Similar to early Linux development on Minix >_<
