hi :333 im making a mini tutorial on how to use openutau!! ill show you how to download, download a voicebank, use a ust, make ur own ust, and tune a bit!

note* everyones tuning style is different, ill tell u a bit abt how it works :D also u dont need to know how to use regular utau for this tutorial. this is mostly for people who dont already know it ^^
and.. this is pretty comprehensive. im mostly making this tutorial for my own enjoyment, so if it goes into too much detail sometimes thats just bc im super passionate abt vocalsynth :] ill basically assume u know nothing lol

1- what the heck is an openutau?

openutau is an open source successor to utau. its a voice synthesis software that does the same thing as regular utau, but it is newer and has a couple updated features, such as;

  • multiple tracks! put all ur parts and back track all in one place
  • easy multilingual ui and encoding
  • pre rendering
  • support for ENUNU voicebanks, which use ai to synthesize and tune the vocals for u
  • availability in different countries, so u dont have to change ur locale to dl it!

openutau is the easiest to use in japanese. there are different voices (voicebanks) you can download in other languages, but japanese is quite easy to get started with and sounds the most natural almost all the time


2- downloading openutau, and The Interface

to download openutau, u have to click this link to see the github page.

under the section called 'install openutau,' u can see different operating systems to choose from. this section also has specific directions for installing the program depending on ur computer/os

after opening the program, ur probably looking at this screen---

theres a lot of stuff to look at! so lets start with the big sections. the giant striped area on the left is where you can put different tracks. music or singer parts. but u cant add any notes from here! we can get to where u put them later.

this is where u will see a singer, when u download one. u get a little picture and their name here

u can click on where it says 'track1,' and u will get the option to change the name of ur track. for example, i name mine something like 'music' and 'main' and 'harmony'

under where it says 'select singer,' theres an option to change ur phonemizer. its set to default when u start a new project. u can change this based off what type of voicebank u use (more on that in a bit--), or u can leave it untouched the entire time

there are a few different menus to choose from at the top. file, tools, and help

file is how u can;

  • open a new empty project
  • open a previous project
  • export the utau files directly or export as sound files
  • save
  • or import other files/music (this can also be done by just dragging in a file)

tools is where u can see all the singers u have installed, install a new singer (soon :3) or change some settings

and help will send u to the openutau wiki. this is especially helpful for technical issues!

this little menu is where u can play the song, (or just press the spacebar XD) or change the time signature or bpm of ur track. if u dont know what those are: bpm is how fast the song goes (beats per minute), and the time signature is how many beats fit in each measure. generally i keep my time signatures at either 4/4 or 6/8, even if the song says otherwise. just simpler that way :]
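if u want to see how bpm and time signature fit together, heres a tiny python sketch. it assumes the bpm counts quarter notes (which is how most music software treats it), and measure_seconds is just a made-up helper name, not anything from openutau.

```python
def measure_seconds(bpm: float, beats_per_measure: int, beat_unit: int) -> float:
    """Length of one measure in seconds, assuming bpm counts quarter notes."""
    # convert the measure into quarter notes (6/8 is 6 eighth notes = 3 quarters)
    quarter_notes = beats_per_measure * 4 / beat_unit
    return quarter_notes * 60.0 / bpm


print(measure_seconds(120, 4, 4))  # one measure of 4/4 at 120 bpm
print(measure_seconds(120, 6, 8))  # one measure of 6/8 at 120 bpm
```

so at the same bpm, a 6/8 measure is shorter than a 4/4 one. thats why the measure numbers at the top of the screen shift around when u change the time signature.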


3- installing a singer!

before u pick which singer u want to install, theres some things u should consider.

1, there are different types of voicebank. the differences depend on how syllables are arranged. for example, the simplest you can get is CV (consonant vowel, or single sound)- which is the easiest to use but may sound a bit disconnected

this would look like; se ka i de

there is also VCV (vowel consonant vowel, or continuous sound), which is basically the same as a CV voicebank but you input the previous vowel sound before each new one. this sounds very smooth, since all the sounds are connected :D!

this would look like; -se eka ai ide e-
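the rule for getting from the CV version to the continuous version is simple enough to write down as code. heres a small python sketch that only handles plain romaji syllables. the function names are made up for this example, and it copies the way notes are written in this tutorial. some real voicebanks put a space inside each note name, so always check ur own bank.

```python
VOWELS = "aeiou"


def tail_vowel(syllable: str) -> str:
    """Return the sound a romaji syllable ends on (its vowel, or 'n')."""
    last = syllable[-1]
    if last in VOWELS or last == "n":
        return last
    raise ValueError(f"not a plain romaji syllable: {syllable!r}")


def cv_to_vcv(syllables: list[str]) -> list[str]:
    """Turn one phrase of CV syllables into tutorial-style continuous notes."""
    if not syllables:
        return []
    notes = []
    previous = "-"  # a phrase starts from silence
    for syllable in syllables:
        notes.append(previous + syllable)
        previous = tail_vowel(syllable)
    notes.append(previous + "-")  # ending note that closes the phrase
    return notes


print(" ".join(cv_to_vcv(["se", "ka", "i", "de"])))  # -se eka ai ide e-
```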

some other common ones are CVVC, VCCV, and ARPAsing. these are slightly easier to use with other languages. if u want to know how to use these, i recommend finding another place to learn after u understand the basics!

2, how ur singer is encoded. this isnt something that can affect how ur end product sounds, but its basically abt ur keyboard and how well u know japanese letters.

note* if u can understand japanese hiragana, using openutau will be much much easier for u!! some voicebanks are ONLY available using kana and if u dont understand how to read that- u will be limited to only voicebanks that use normal letters. if u would like to learn hiragana i would recommend looking here!

a singer can be encoded in 2 main ways. kana or romaji. romaji is the letters ur probably used to, -se eka ai ide

kana refers to (usually) hiragana, which would make the sounds look like this -せ eか aい iで e-

its up to u which format ur more comfortable with. more utaus are available with kana, and some are even available with both

now u can pick a singer! here is a link to find utaus by voicebank type (cv, vcv, etc) :D my first utau was kasane teto, so im going to be using her for the tutorial. she is kana encoded, and has both cv and vcv voicebanks. which u can download separately. here is the link to all her voicebanks. both cv (single sound) and vcv (continuous sound) are under standard! u will have to translate the site. the other voicebanks are mostly in vcv format, they just sound slightly different. u can actually download all of them at once and have her sing in different tones during the same song ^^ (dont do that yet- lets go over how to get her into the program first!)

when u download an utau, u should get a zip folder. DONT UNZIP IT! openutau wont be able to install a singer unless it is specifically .zip

now get back to openutau. under the tools menu at the top, click 'install singer.' you will then be able to look thru the files on ur computer, until u find the voicebank u downloaded. click on that

then u get some basic options. this is how openutau figures out which writing system ur voicebank is encoded in. at the bar at the top of the mini window, scroll thru the options until the letters look like actual letters or kana. this is what it can look like if ur letters arent lettering. the last of the 3 screens u see is just whether or not ur utau is actually an utau. and if u download teto, or anyone on the site i linked, it is!

and... ur singer should be installed. next ur back at the main screen. click on 'select singer'

this is what mine looks like- i have a lot of voicebanks downloaded. but u should see the name of ur voicebank pop up, so click that. then there will be a picture of them in the side panel! and their name should be there, too X3

dont mind i have miku installed as an utau, lol! not allowed with vocaloid TOS but im silly


4- lets make some noise!

to make a part to put notes in, click on the main area in the row where ur singer is. u can zoom out and drag the end of the part to make it longer. the numbers at the top are the measures

if u double click on ur new part, which is pink bc its highlighted, u get a whole new interface. u may also get the tooltips menu, so u can read that to see the basic tools

theres a regular cursor, a pen tool to draw notes, a pen plus tool to draw and delete, an eraser to delete, a pitchbend editor pen, and a knife to separate one note into two. try drawing some notes!

on ur left is the piano roll, which just tells u the note names of whatever u want to put down.

to play, use the cursor tool to place the vertical gray bar BEFORE ur notes. this shows where u are when u want to play. then press space!

unless u already know what notes u want to put down, it probably doesnt sound super amazing. it might not even play at all. if it doesnt play, thats bc the encoding of ur voicebank doesnt match the note on the screen

to change the note on ur screen, double click on it

if u start to type a syllable, any encoded syllables that match the one u typed will show up. this is especially helpful if ur voicebank is kana encoded, bc u dont have to type any kana to get the characters

the name at the side tells u which specific voicebank the note comes from. this is good if u have a couple different voicebanks from the same utau, like i do with teto :D

if u use a cv voicebank, its likely ur syllable is the only one that shows up. if u use vcv, the FIRST note of any group should start with "-" and the last should end that way. these notes are vcv specific notes that make the beginning/ending of utterances sound more natural. remember, -せ eか aい iで e- / -se eka ai ide e-

sometimes the ending note wont have a "-" but will have a capital R instead. it stands for rest, and just does the same thing. depends on the utau

also sometimes, the notes have note names at the end. this is just to tell u where to put them, like a G4 note shouldnt go below the G4 note line. this is bc some utaus have different voice tones depending on how high/low the note is!
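if note names like G4 are new to u, heres a tiny python sketch for comparing them. it assumes the common convention where C4 is note number 60, and note_number / at_or_above are made-up names for this example, not part of openutau.

```python
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def note_number(name: str) -> int:
    """Convert a name like 'G4' or 'C#5' to a note number (assumes C4 = 60)."""
    pitch, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + NOTE_OFFSETS[pitch]


def at_or_above(note: str, lowest: str) -> bool:
    """True if `note` sits on or above the `lowest` note line."""
    return note_number(note) >= note_number(lowest)


print(note_number("G4"))        # 67
print(at_or_above("A4", "G4"))  # True
print(at_or_above("F4", "G4"))  # False
```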

also, any note with a dot in it has some vocal fry! anything else is almost always specific to ur particular utau


5- USTS

a UST is a file type that utau and openutau can take. a USTX is specific to openutau, as USTX files can contain multiple tracks. a ust file is good for when u dont want to build up ur own cover from the ground up, just start with a kind of template
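if ur curious whats inside a UST: its plain text, with one little block per note. heres a python sketch that reads the notes out of a tiny made-up UST (the sample text and the read_notes name are invented for this example). real USTs have more fields per note and are often saved in a japanese text encoding, so treat this as a peek inside and not a full reader.

```python
SAMPLE_UST = """\
[#VERSION]
UST Version1.2
[#SETTING]
Tempo=120.00
[#0000]
Length=480
Lyric=せ
NoteNum=67
[#0001]
Length=480
Lyric=か
NoteNum=69
[#TRACKEND]
"""


def read_notes(ust_text: str) -> list[dict[str, str]]:
    """Collect the key=value fields of every numbered note section."""
    notes = []
    current = None
    for line in ust_text.splitlines():
        line = line.strip()
        if line.startswith("[#") and line.endswith("]"):
            name = line[2:-1]
            # only sections named with digits (like 0000) are notes
            current = {} if name.isdigit() else None
            if current is not None:
                notes.append(current)
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return notes


for note in read_notes(SAMPLE_UST):
    print(note["Lyric"], note["NoteNum"], note["Length"])
```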

u can find USTs pretty easily online, if u just google 'ust download (song)'

u may also find other file types, like a VSQx (for vocaloid). u can technically use these the same, but sometimes stuff gets messed up a bit. i recommend this website to convert types

for this tutorial, im gonna use this cover of 宴 by LunarConstruct for a UST! check out their stuff, pretty cool :3

CREDIT WHO MAKES YOUR USTS REALLLLY IMPORTANT!! it takes a lot of effort and time!!!

the one im downloading is a USTx, so it has all the parts in one file. otherwise, u have to add them in one at a time :P when u download the file, if its already in the format u want, just drag it into the openutau interface. this is also how u import music files

this USTx had a music file when it was made, but i dont have that file on my computer. that means it says its missing. u can delete that part, or mute it with M, or leave it. it wont hurt u to have an extra nothing part!

also, this is a cv file. u can go in and change each letter individually, or use a site like utaformix to change it automatically. the thing with utaformix, is that it deletes all the visuals for the tuning without deleting it for real... somehow-?

so if u do that, u wont be able to change how the file is tuned at all. i dont recommend. u should probably do it urself, even if its a bit time consuming. or u can reset the tuning first, and then put it in utaformix. it all depends on whether or not u want to keep the original tuning X3

the red lines that showed up between all ur notes when u put in the UST are pitchbends (if they just look like plain links between notes, the UST u found wasnt tuned)

tuning, or editing the pitchbends, is what makes the difference between a kikuo vocal track and a pinocchioP vocal track, at least for a lot of the time

u can develop ur own style of tuning, from complete autotune-style to as realistic as u can make it :D i can teach u the basics of how to make this stuff work, and a kind of 'standard' way to tune. but i cant tell u everything! u need to experiment!!!

but if u dont want to mess with the tuning of ur UST, u can just export the files as .wav in the file menu and put them together in an external software

i honestly use capcut for this, just bc its free and simple. just drag and drop, and match up timings


6- tuning basics

the first thing i do when im tuning my own UST is to reset the original pitchbends. select all the notes, either with ctrl+a or dragging with the cursor tool, and look in the "notes" menu for "reset pitchbends"

u can drag around the pitchbends all u want. try dragging them around a lot, and a little, just to see what that sounds like :]

u can also change vibrato, which makes the voice a bit shaky, with the little squiggle underneath each note. again, u can drag around some things to see how it sounds
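one way to picture what the squiggle does: vibrato is like a small wave added on top of the notes pitch. this python sketch assumes a plain sine wave, and depth_cents and period_ms are made-up names for illustration. its not openutau's actual math or settings.

```python
import math


def vibrato_offset(t_ms: float, depth_cents: float, period_ms: float) -> float:
    """Pitch offset in cents at time t_ms, modelled as a simple sine wave."""
    return depth_cents * math.sin(2 * math.pi * t_ms / period_ms)


# sample one full wobble of a vibrato that swings 25 cents every 200 ms
for t in range(0, 201, 50):
    print(t, round(vibrato_offset(t, 25, 200), 1))
```

a bigger depth means a wider wobble, and a shorter period means a faster one.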

heres an example of how i tuned a bit;

and heres how LunarConstruct did it!

both are equally good-sounding ways to make this sound. just different styles. LunarConstruct has a bit more crisp, pop-y style, and mine starts notes a bit early to make them sound a bit more real

there honestly isnt too much else i can tell u! if u want to learn more, try stuff out on ur own or reference others! i also have a folder of all my tuned USTs u can look at


7- other tools

as u can probably see, there are a lot of other things u can mess with

remember when we talked abt phonemizers? it can change depending on the UST u input, but is normally set to default

this is a way to let ur utau sing a CV UST just like it would a VCV one, or the other way around. just pick the language u are using and go from there!

theres a lot of stuff in this area as well

the gray bit is the waveform of the audio- so it shows volume. if there is no gray bit and u have it turned on, there isnt going to be any sound

the blue sections are called "envelopes!"

they show how much overlap notes have between each other. u can drag this around if u want. i dont usually mess with it, bc openutau is pretty good at automatically editing them for u. u can see the difference if u look at CV envelopes and VCV envelopes :]

the area under it has about a gazillion different things u can change. the five different options on the side are just slots where u can switch quickly back n forth between a few options. click on one and u get the full list!

i wont go over what each one does, but u can experiment.

if ur done messing with ur UST, export the files as .wav and go put them all together :D


8- make ur own UST!

now u know the basics of how to make an utau cover. but- how do u make something from scratch?

the easiest way is to find a MIDI file. this is just like an UST, but without the pitchbend edits or lyrics. but they can be hard to find. u can always make ur own MIDI in another software, but it can be difficult!

to make ur own UST without a midi, all u rlly need is the file of the song u like in openutau and the BPM set in

then, u have to go through with the song at a low volume and try to match up the notes with what u hear. it can be tedious, but if u can do that for a whole song u will have made an entire UST, tuning and all, all by yourself! its rlly something u should be proud of :]


---


okay, openutauer. u learned a whole lot about how to make a vocalsynth cover! if u read this and ever post ur music anywhere, i would be absolutely delighted to hear it ^^ have lots of fun practicing :D













psst... if you want to hear about some more complicated stuff, keep scrolling ;3

im making this months later.. hopefully the top is still helpful ^^





parameters menu

you can operate 5 parameters at a time in the lower menu. they come in curves and blocks, which is basically just a minor interface change

the first parameter is dynamics, which changes note volume

pitch deviation is a second way to tune notes, instead of using pitchbends. drawing it up makes the pitch higher, drawing it down makes it lower

voice color is used when a voicebank has several subsets inside the main voicebank, like teto does. i'll go over this later, and usually you have to set them up yourself

then you can change the resampler engine. if a parameter says "expression not supported by renderer," switch to another resampler

velocity is good for emphasis. lower = more emphasis, which is easy to see if you turn on envelopes

volume is volume ;3

attack is the volume of the first part of the note. this can sound good on consonants, but makes vowels sound kind of wacky. makes end breaths sound very bad, usually

decay is the volume of the end of the note

gender affects the maturity of the voice. less gender makes it sound less mature, kind of feminine. more makes it sound more mature, kind of masculine. messing with this too much may make some funky cracking noises, but probably wont crash your program. probably

breath/breathiness make the notes more breathy! good for whispers

lowpass cuts out the frequencies above a certain point, which kind of muffles the note. in my experimenting, i have not been able to notice much of a difference ,;3

modulation makes the voice a bit shaky. use REALLY depends on which utau you use! some you shouldnt go past 20, and some you can never hear a difference. a real voice doesnt stay at a constant note the whole time, so this can make it more realistic

i honestly could not tell you what alternate does. i have fiddled with this thing for a while, and even google cant help me... let me know if you figure out what its doing!!

tone makes the voice "hard" or "soft." high tone makes it seem like the note is really being forced or yelled out, low tone makes it softer or almost whispery

tension is similar to tone, and can sometimes make the voice kinda creaky

voicing is how whispery the note is. default is not whispery, but if you bring it down a bit it gets softer


voice color/subbanks

some utaus have several different mini banks inside the main one. teto has a bunch, including her whisper and sakebi banks. to see what banks ur utau has, either look it up or find the sound files in the area im about to show you

in the main track menu, select "tools" and "singers."

most of this you dont have to worry about, especially the waveform at the bottom. this screen is showing you the names of the individual sound files (file), the note names you type in (alias), the subbank (set), and some technical stuff

i'm looking at teto's files, with a bunch of subbanks installed

i see that the edge bank looks like -ta' or aka', with that apostrophe.

on the right, click edit subbanks

click add color, and name what your subbank will be. you can pick any name, i usually go with the name on the website. i'll go with edge. then, select all. type in the set prefix or suffix. teto edge has an apostrophe suffix, so i'll type that in. click set, save, and then it's done!
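what the suffix step is doing is basically sorting note names by how they end. heres a small python sketch of that idea. the alias list and the aliases_with_suffix name are made up for this example, and openutau does this for u in the subbank editor.

```python
def aliases_with_suffix(aliases: list[str], suffix: str) -> list[str]:
    """Pick out the note names that belong to a subbank marked by `suffix`."""
    return [alias for alias in aliases if alias.endswith(suffix)]


# a made-up handful of aliases, some with the apostrophe suffix
aliases = ["-ta", "aka", "-ta'", "aka'"]

print(aliases_with_suffix(aliases, "'"))  # ["-ta'", "aka'"]
```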

you can see what subbanks you have set up in the top right corner of the singers menu, or in the voice color section of the parameters. subbanks are changed one note at a time, as they're usually meant to be used side-by-side in one song


extra sounds

most utaus have a bunch of other stuff you can add into your song, like end breaths, vocal fry, glottal stops, or end consonants

to find these, go into the singer menu again. im going to use tokumei merged for this part of the tutorial, since they have QUITE a few extra thingies

you start by looking through the aliases. if something is there that isnt a hiragana letter or a roman letter, theres something cool there.

tokumei's first alias is あ'2. so you just go back to the track editor, make a track, and type that in. and see what it does! in this case, あ'2 is a more "scratchy" "a" vowel, with vocal fry. i HIGHLY recommend creating a note somewhere, either paper or online, about what each thing does. here's mine on tokumei:

and if you'd like to see what my test files look like, here's the ustx (note, the tracks wont sound good if you play them all at once! this is to test individuals) (other note, if you havent set up all their subbanks, some parts of this will sound odd) where i test shione LT, teto, yokune ruko, milk, tokumei, gekiyaku, and udzuki

theres some other things a bank may include.

end breaths, either in or out, are usually marked with an "R." tokumei has R, RR, V, VV, a/, iR, ', E. these all sound different. these go at the end of a note, and either give the note a realistic fadeout or make it sound like the utau is breathing in after they finished the musical phrase. if you see the character "息" (breath) its probably a breath sound, "吐" (exhale) is probably a breath out, and "吸" (inhale) is probably a breath in

along with subbanks, most utaus have different pitches. usually, these change automatically. these are different sets of all notes, sung at different pitches. your voice doesnt sound the same when you sing a really high note & really low note, so the utau's voice doesnt either. of course, some voicebanks with only one pitch are perfectly good banks, like kasane teto. heres a picture of a bank with automatic pitches;

some banks also have ending consonants. an example of when i would want to use this is writing the word "desu." when you say that out loud, you dont say "deh-soo." you say "deh-sss." but an utau would pronounce it the first way, since letters come in clusters. with tokumei, i would use "-de" and "e s"

some also have a rolled "r." tokumei doesn't, and teto's is different than normal. she has one "巻" note that you place separately and before the r note. most of the time, the "巻" is included within the r note, like "a ら巻"

rarely, but sometimes, there are also some plain breath files. these could be named anything, really, so youll have to find them. tokumei doesnt have any, but teto has b1 and b2. these dont sound amazing when you play them in utau, but the raw files are somewhere within the utau folder and you can drop them in when youre mixing the song, if you want

this is the temporary end of the cooler tutorial, until i find the time to write about more stuff :D

heres a good video to check out if you want to learn a little bit more about what i was talking about :)