闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

CPSLP Programming assignment

1 Introduction - Speech Synthesiser

Your task for this assignment is to create a speech synthesiser! Your Python program will take text input from a user and convert it to a sound waveform of intelligible speech. This will be a very basic waveform concatenation system, whereby the acoustic units are recordings of diphones. You will be provided with several ﬁles to help you do this:-

simpleaudio.py

This is a version of the simpleaudio .py module that we have used in the lab sessions. The Audio class contained therein will allow you to save, load and play .wav ﬁles as well as perform some simple audio processing functions. You should not modify this ﬁle.

synth args.py

A module that will handle commandline parsing for you. You should not modify this ﬁle.

synth.py

This is a skeleton structure for your program, with hints to get you going. You will see there are some classes and functions there already to start you off, but your task is to add more code to make it work. You are free to add any classes, methods or functions that you wish but you must not change the existing 2 class names, the function prototypes (i.e. the function/method names and their parameters) or argparse arguments. There are also several comments to provide guidance - please do delete these to tidy up your code once you have progressed with your solution.

diphones/

A folder containing .wav ﬁles for the diphone sounds. A diphone is a voice recording lasting from the middle of one speech sound to the middle of a second speech sound (i.e. the transition between two speech sounds). If you are unfamiliar with diphones, try listening to some of them to understand what this means. (Hint: you cannot rely upon the fact that all possible diphones have been given to you!)

examples/

A folder containing example .wav ﬁles to compare how your synthesiser sounds with respect to a reference implementation.

test synth.py

A suite of unittests for testing your implementation. This set of tests does not aim to test the *correctness* of any results, but instead merely aims to ensure the *API* works in the way expected by the automatic marking code (i.e. checks your code won’t break the automatic marking code), so you must ensure they run successfully before submitting your code. Any issues caused by your code failing to pass these pre-submission unittests will be your responsibility.

To set yourself up to work on this assignment most conveniently, just create a PyCharm project in this directory (i.e. the one created by unpacking the downloadable zipﬁle). Then simply add 2 conﬁgurations:

A standard run conﬁguration, which you should set up to run the synth .py script. Add initial commandline arguments “-p hello”, so that when you click the green run button, PyCharm will execute the following for you automatically:

python synth .py -p hello

That can be just to get you going; you can subsequently change the command line param- eters in the PyCharm project conﬁguration to suit whatever you are working on at the time. Or, alternatively, just run your program on manually yourself from the command line with whatever commandline arguments your like.

A standard unittest conﬁguration, pointing at the test synth .py ﬁle. Clicking the green run button when this conﬁguration is active will run the unittest suite provided for you.

2 Task 1 - Basic Synthesis

The primary task for this assignment is to design a program that takes an input phrase and syn- thesises it. The main steps in this procedure are as follows:-

normalise the text (convert to lower case, remove punctuation, etc.) to give you a straight- forward sequence of words

expand the word sequence to a phone (or “speech sound”) sequence – you should make use of nltk .corpus .cmudict to do this, which is a pronunciation lexicon provided as part of NLTK. It is already imported in the skeleton script for you, but you will need to ﬁnd out yourself what the cmudict object is, what it can do and so how to use it for your purposes. Don’t forget, utterances should always start and end with a silence phone! For example, saying the sentence “Hello!” requires a phone sequence of [PAU, HH, AH, L, OW, PAU] (where PAU is the label for the short pause or silence phone).

expand the phone sequence to the corresponding diphone sequence (e.g. to say “Hello!” we need [PAU-HH HH-AH AH-L...] etc.)

concatenate the necessary diphone wav ﬁles together in the right order to produce the required synthesised audio in an instance of the Audio class. Your aim here is to create a single new waveform (e.g. one array of audio data in an Audio class instance), and not just to quickly play one diphone sound after the other, for example.

A user should be able to execute your program by running the synth .py script from the command

line with arguments, e.g. the following should play “hello”:-

python synth .py -p "hello"

If a word is not in the cmudict then your program should take appropriate action (i.e. raise an exception with an informative message for the user).

You can listen to the examples hello .wav and rose .wav in the ./examples subdirectory, which were created as follows:-

python synth .py -o hello .wav "hello nice to meet you"

python synth .py -o rose .wav "A rose by any other name would smell as sweet"

If you execute the same commands with your program and the output sounds the same then it is likely you have a functioning basic synthesiser! You could also compare waveforms sample by sample, to ensure your synthesiser functions entirely like the reference implementation.

3 Task 2 - Extending the Functionality

Implement at least two of the following extensions (but note, the easy ones do not receive as much credit):-

Extension A – Volume Control [effort: 1 = easy]

Allow the user to set the volume argument (--volume, -v) to a value between 0 and 100 (minimum and maximum loudness respectively) to change the amplitude of the synthesised waveform.

Extension B – Speaking Backwards [effort: 2 = fairly easy]

Have fun by making the synthesiser speak backwards in the following three ways:

signal When the user gives the commandline option --reverse signal, switch the wave- form signal for the whole synthetic utterance back to front.

words When the user gives the commandline option --reverse words, reverse the order of the words that will be synthesised (e.g. "oh hi there" -> "there hi oh").

phones When the user gives the commandline option --reverse phones, reverse the order of the phones that will be spoken for the whole utterance.

Extension C – Synthesising a File [effort: 3 = medium]

Add appropriate code to enable the user to give the --fromfile ﬂag. The user can use this ﬂag to specify a text ﬁle containing text to synthesise – and in this case, your program should open the ﬁle with the given ﬁlename and synthesise whatever is contained there. Note, however, your program should process the text sentence by sentence. For example, your program should synthesise the ﬁrst sentence and then play it if the --play ﬂag is given, then synthesise the second, and then play that one, and so on. If the --outfile option is given by the user, then each of the synthesised waveforms must be concatenated to a single waveform prior to saving it at the end.

Extension D – Text Normalisation for Dates [effort: 4 = more challenging]

If the string contains dates in the form DD/MM, DD/MM/YY or DD/MM/YYYY, then convert them to word sequences, e.g. –

"John Lennon died on 8/12/80"

-> "john lennon died on december eighth nineteen eighty"

To keep things simple, assume your code will only need to handle years in the range 1900- 1999. [hint: you may wish to make use of the built-in datetime and/or re modules to help you do this.]

Extension E – Smoother Concatenation [effort: 4 = more challenging]

Simply pasting together diphone audio waveforms one after the other can lead to audible glitches where the waveform “jumps” at boundaries between diphones. Implement a simple way to alleviate this by “cross-fading” between adjacent diphones using a 10 msec overlap. To achieve a cross-fade, you need to lower the amplitude at the end of one diphone down to 0.0 over 10msec and then overlap and add in the signal from the start of the next diphone which is similarly tapered from 0.0 to normal amplitude over the same 10msec period. Note this will only mitigate one cause of “choppiness” in the synthetic speech and so can only do so much to make it sound better! Audio examples are provided for you to compare cross- faded concatenation with simple concatenation. Compare sizes of those ﬁles with ones produced by your code. to make sure you’re not losing or gaining any samples. [hint: you will probably ﬁnd numpy very useful to implement this extension in a succinct and efﬁcient way!]

4 Rules and Assessment

Your submission will comprise a single ﬁle of Python code and should abide by the following rules:-

all submissions must be written individually and be your own work.

the University’s penalties for plagiarism are potentially very harsh. Googling how to use a particular Python object or syntax feature is to be expected, and ﬁnding information in that way is no problem at all. If you ﬁnd a one-liner to achieve some neat “trick”, it’s also ﬁne to include that, as long as you attribute it (e.g. provide the URL for the StackOverﬂow page you found it on in a code comment for example). In contrast, cutting and pasting whole sections of code, or even whole functions/classes is deﬁnitely not in the spirit of the exercise and will be penalised. Note, Python code plagiarism detection software can spot copied code even if the names and formatting are changed. So, please, do just put into practice everything you’ve learned this semester and come up with your own code.

your submission may only use numpy, nltk, the provided ﬁles, and any packages that are built-in to Python. I must be able to run your code on my computer using just the one ﬁle and without installing anything else.

you may not change any of the existing argparse arguments provided in synth args .py – when you view that ﬁle you will see argparse has already been set up to take all the correct command-line arguments for you.

The assignments will be graded out of 100 and will be assessed using a combination of automatic testing and human markers according to the following criteria:-

Task 1 (total 30 marks) -

Ensure your system:-

is able to synthesise test phrases paying attention only to full words (ignoring abbrevi- ations, punctuation or numbers for example).

● is reasonably robust - for example it can handle out of vocabulary words, or malformed input, or a missing “diphones” directory, or even missing diphones, in an elegant and/or informative way rather than just falling over or breaking

● is able to play and/or save the output to a ﬁle

IMPORTANT: just because your code might run and work as speciﬁed above, it does not mean it will necessarily get full marks! Many aspects of your code will be considered beyond simply whether it works or not, including for example:

has the task clearly been understood with clear evidence of a sensible attempt to im- plement solution?

does it work entirely as speciﬁed?

is it efﬁcient?

is the code well-structured, sensible and with good clear logic?

is the code legible, sufﬁciently documented and “friendly” for others

is the code robust?

is the code written in a succinct and pythonic way?

Task 2 (total 30 marks) -

Implement at least two of the extensions. Note that the extensions clearly vary in terms of the effort required to implement them. Extension A is far simpler than Extension D or E, for example. The marks available reﬂect the effort required for each of the extension tasks (indicative marks: easy (2); fairly easy (4); medium (7); challenging (10)). To aim for a higher grade, you can choose to implement more of the extensions, taking care to present strong solutions (three extensions implemented to a high standard could achieve the same grade as 4 extensions implemented less well, for example). Again, keep in mind key characteristics such as functionality, design, maintainability, efﬁciency, and robustness.

General Criteria (total 40 marks) -

Further marks will be awarded based on how well your code is designed and presented overall. Therefore, take care to use appropriate modular design as well as suitable code formatting, naming and plenty of appropriate comments and docstrings. You can also earn more credit by being ‘pythonic’ (i.e. keeping your code succinct, well-organised and easily- readable; good use of nice Python language features (e.g. comprehensions); etc.). (Criteria weighting: Design (10); Readibility (10); Pythonicity (10); Documentation (5); Robustness (5)).

You will be required to submit your modiﬁed synth .py ﬁle as arranged by the Teaching Ofﬁces through Learn once you have ﬁnished.

Submissions are marked anonymously. You must prepend your exam number (the ‘B’ number on your student card) to the ﬁle prior to submitting it, e.g. your submitted ﬁle should be in the format:

B123456_synth.py

In addition, do not include any identifying content (e.g. Matriculation number, name etc.) in the submitted ﬁle. Submit only one Python code ﬁle in the above format (and not a zip ﬁle for example).

Finally, if you should have any queries about the task, or questions more generally, then please do not hesitate to ask!

2022-11-28

Java

物理(Physical)

LINUX

C++

Python

Processing

sas

ios

maths

maple