Speech samples for the paper "Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems", presented at IEEE SLT 2018 (Workshop on Spoken Language Technology).
A pre-print version of this paper can be found at https://arxiv.org/abs/1807.11632
More publications (arXiv pre-prints, posters, ...) about speaker adaptation and speech synthesis can be found on my website.
Samples synthesized using the WORLD vocoder
Samples synthesized using the WaveNet vocoder for selected strategies
1st sample
Strategy | Non-linear, 10 | Non-linear, 320 | Linear, 10 | Linear, 320 |
---|---|---|---|---|
Natural | ► Play | | | |
Biasm | ► Play | ► Play | | |
Bias | ► Play | ► Play | not available | not available |
Scale | not available | not available | not available | not available |
Affine | not available | not available | ► Play | ► Play |
Level | not available | not available | ► Play | ► Play |
Bottle | not available | not available | ► Play | ► Play |
2nd sample
Strategy | Non-linear, 10 | Non-linear, 320 | Linear, 10 | Linear, 320 |
---|---|---|---|---|
Natural | ► Play | | | |
Biasm | ► Play | ► Play | | |
Bias | ► Play | ► Play | not available | not available |
Scale | not available | not available | not available | not available |
Affine | not available | not available | ► Play | ► Play |
Level | not available | not available | ► Play | ► Play |
Bottle | not available | not available | ► Play | ► Play |