Biol 458 Midterm Project

Important Instructions about Collaboration

The restrictions for this activity are very similar to those of your weekly lab activities. Please feel free to discuss the project with others, ask your instructor questions (even publicly via Piazza), and share suggestions on coding strategies and R functions with your fellow students. However, as you prepare your work for this assignment, you’ll end up creating a script of commands. Please make sure the contents of this script remain private. You should never be looking at someone else’s script, nor should you be showing your script (or subsets thereof) to others. If you wish to demonstrate a technique to a classmate (e.g. walk them through the workings of a for loop), please try to prepare a code snippet that has nothing to do with this project or your own script. When in doubt, please feel free to ask for my opinion on whether something “crosses the line”.

An Overview of the Project

The current year is 2010. The fictional city of Linearville (LV for short) has been experiencing remarkably linear growth in human population density over time. However, this growth has had a negative impact on the populations of several local fauna often seen in and around the city. Based on current trends, some of these fauna may become locally extinct in the area around Linearville.

Your mission is to take a closer look at four species in particular: squirrels, skunks, coyotes, and foxes. Each of these species has been experiencing a decline in population density, and your goal is to identify the single species that is predicted to be the first to become locally extinct in the area around Linearville. Once this at-risk species has been identified, your goal will be to set up a protected habitat for this species (with an established carrying capacity) some distance from Linearville. Once the population in this habitat reaches a mature level, the plan is to routinely transfer animals from the habitat to the area around Linearville (to help the species maintain its local population).

One of your colleagues has proposed a plan for the establishment of this habitat.  The plan is based on the population growth behaviour of the species under ‘idealized conditions’ (i.e. far from Linearville). Your goal will be to determine whether the plan should, in theory, allow you to start supporting the local population of your at-risk species around Linearville  before that local population goes extinct.  It’s possible the plan will be too slow – in that case, you need to point out that a more aggressive plan may be needed.

High-Level Summary of Your Tasks

1. Read in the four data files corresponding to population histories around Linearville (marked ‘LV’).

2. For each of the four species, plot a graph of population vs time,  and fit a linear regression line modeling the relationship between population and time.

3. Use the regression equation to predict when each of the four species will go locally extinct.

4. Select the species that is predicted to become locally extinct the soonest as the “at-risk” species.

5. Read in the data file for the population growth history of the at-risk species under ‘ideal conditions’.

6. Determine the empirical year-by-year growth rates (lambdas) under ideal conditions, and use those values to estimate a (geometric) mean.

7. Use this geometric mean to predict the likely trajectory of population growth in a conservation habitat with a fixed (logistic) carrying capacity.

8. Establish when (i.e. how many years into the project) this predicted population growth rate would be maximal.

9. Determine whether this maximal growth date occurs before or after the local population around Linearville is predicted to go extinct.

A Remark about the Data Sets

I’ve set up a program that allows me to randomly generate data sets for this project. Together with this document, I’ll post three different versions of the data set for you to experiment with. These versions will have numbers that result in different outcomes – each of the four species has a chance of being identified as the species “at risk”, and the proposed conservation plan won’t always be ready in time to bolster the local population around Linearville. I’ve also provided “solution cards” for these three data sets, highlighting some of the key values you should calculate and the expected outcome of the conservation effort.

The goal of this project is to construct a script that is sufficiently “general purpose” that it can be run on any data set that is provided for you. In other words, all you should need to do is switch the “input files” – the script should do the rest of the work to lead you to the expected outcome of the conservation effort, given the provided information.
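One possible way to keep a script “general purpose” is to name the input location once, at the top, and let everything else derive from it. The sketch below is only an illustration: the folder `data_dir`, the file names, the column names and the values are all made up (the demo writes its own tiny files so that it runs on its own), and your real files may be organised differently.

```r
# ASSUMPTION: every name and value here is hypothetical; the demo creates
# its own small files so the snippet runs without the real data set.
data_dir <- file.path(tempdir(), "dataset_demo")
dir.create(data_dir, showWarnings = FALSE)
for (sp in c("squirrel", "skunk", "coyote", "fox")) {
  write.csv(data.frame(year = c(1960, 2010), density = c(50, 10)),
            file.path(data_dir, paste0(sp, "_LV.csv")),
            row.names = FALSE)
}

# Switching data sets then only means changing data_dir.
# The handout says the Linearville files have "LV" in the file name.
lv_files <- list.files(data_dir, pattern = "LV", full.names = TRUE)
lv_data <- lapply(lv_files, read.csv)
names(lv_data) <- basename(lv_files)
```

With this layout, the rest of the script can loop over `lv_data` rather than mentioning any particular file by name.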

As part of your assessment, I’ll ask you to contact me when you think your script is robust and ‘ready’. Once I receive this communication, I’ll send you one last personalized (randomly generated) data set for you to analyze. If all goes well, you should just need to run your script and send me the results! But just in case something goes wrong, I’ll give you a window of time to adjust your script and its output (should that be necessary).

Deliverables and Assessment

Here’s what I’m expecting you to submit for this project (and how I’ll assign points to your work).

Please submit:

1. Your script. I’ll assign it a grade between 5 and 0. This grade will be influenced by: how easily I can follow your work, how much “manual intervention” is required in your script, and the uniqueness of your script (i.e. whether it looks too similar to the work of a classmate).

2. A document of important results/statistics derived from your personalized data set. You can use my “solution card” documents as a guide for the sorts of things you could report. I’ll assign this a grade between 5 and 0 based on: correctness of the results, comprehensiveness of the results, and legibility of the document. You need not report your findings in complete sentences; just make it reasonably easy for the reader to determine what everything means.

3. For your at-risk species specifically, a plot of the population history around Linearville (together with linear regression line). I’ll assign this a grade between 2 and 0 based on presentation and accuracy. Feel free to make it pretty.

4. For your at-risk species specifically, a scatterplot of the population growth history under ‘idealized’ conditions, supplemented with a line tracking the predicted population growth under a logistic model. I’ll assign this a grade between 3 and 0 based on presentation and accuracy. Feel free to make it pretty.

In summary, you’ll be assigned a grade out of 15. Up to 5 for the script, up to 5 for the stats from your personalized run, and up to 5 for your supporting graphs.

Step-by-Step Walk-through

Phase 1: Determine which species is at-risk. You’ll see four files with “LV” in the file name. These correspond to the population densities of each of the four species in the area of Linearville. Each file will have two columns of data – one column of years, and one column of recorded population densities. The interval of years is mostly consistent across files (dates between 1960 and 2010, the present year), but the files may differ in which specific years have data available.

For each of these files, you’ll want to read the data into R using read.csv. As an intermediate step, it will probably help to examine a scatterplot (plot) of each data file, with years on the x-axis and population densities on the y-axis. In order to determine when you predict the species to go locally extinct, you’ll need to construct a linear regression model using lm. Recall that lm uses a slightly different notation style in its arguments – if your plot is plot(x=xstuff, y=ystuff), then the corresponding linear model call should look like lm(ystuff ~ xstuff). You’ll need to examine the output of lm; recall that it’s a list, and that one of the components of the list is a set of regression coefficients.
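As a rough illustration of the read/fit/extract sequence described above, here is a minimal sketch. The column names `year` and `density` and all the numbers are assumptions made for the demo (it writes a small temporary file so that it can run by itself); check the headers of your actual files.

```r
# ASSUMPTION: hypothetical file contents and column names, for illustration only
lv_file <- tempfile(fileext = ".csv")
write.csv(
  data.frame(year    = c(1960, 1970, 1980, 1990, 2000, 2010),
             density = c(50, 42, 33, 26, 17, 10)),
  lv_file, row.names = FALSE
)

lv <- read.csv(lv_file)

# Formula notation: response ~ predictor (y first, then x)
fit <- lm(density ~ year, data = lv)

# coef() returns the regression coefficients as a named vector:
# "(Intercept)" plays the role of b, and "year" plays the role of m
b <- coef(fit)[["(Intercept)"]]
m <- coef(fit)[["year"]]
```

Typing `str(fit)` or `names(fit)` at the console is a quick way to see the other components stored in the list that lm returns.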

In order to determine when a species is predicted to go locally extinct, we’ll want to follow the regression line until it crosses the x-axis – in other words, we want to find the x-intercept of the line. Recall that the regression model is of the form y = mx + b, where m and b are coefficients from your regression model.  In order to find the x-intercept, you want to find the x value (year) that causes mx + b to equal zero. To put it another way, you’ll have an equation of the form 0 = mx + b and you’ll need to solve for x.
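Solving 0 = mx + b for x gives x = -b/m. A small sketch of that calculation follows; the helper name `x_intercept` and the data are hypothetical (the numbers were chosen to fall exactly on a line that reaches zero in 2015).

```r
# ASSUMPTION: illustrative data lying exactly on density = 110 - 2 * (year - 1960)
lv <- data.frame(year    = seq(1960, 2010, by = 10),
                 density = c(110, 90, 70, 50, 30, 10))
fit <- lm(density ~ year, data = lv)

# Hypothetical helper: the year at which the fitted line crosses zero
x_intercept <- function(fit) {
  co <- unname(coef(fit))  # co[1] is b (intercept), co[2] is m (slope)
  -co[1] / co[2]
}

extinction_year <- x_intercept(fit)
```

Note that the result is generally not a whole number of years, so you may want to decide how (or whether) to round it when you report it.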

Once you’ve obtained the x-intercepts for each of the four species, you’ll be able to determine which species should be considered “at-risk” for the project (whichever one has the lowest x-intercept). Note this species before proceeding to the conservation phase of the work. To summarize your work in the first phase of the project, please note that you should prepare at least one scatterplot (of population around Linearville versus time) for the species you’ve deemed to be “at-risk”. This scatterplot should include a regression line; you may wish to consider placing the regression equation somewhere in or around the plot as well. While this is the only plot I’m expecting, you may also find it interesting/informative to construct scatterplots for the other three species as well – if nothing else, it might help convince you that you’ve made the right choice about your species of focus. Also note that you’ll be producing a document of key results/statistics for your report – in phase 1, you’ve established which species is “at-risk” and you’ve determined how much time you should have before that species goes extinct (as determined by the x-intercept from your regression model).
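To illustrate the selection and the plot, here is a sketch that compares four x-intercepts and draws the at-risk scatterplot with its regression line. All the data, the column names and the output file name are invented for the demo; the plot is written to a PDF in a temporary folder purely so the snippet can run unattended.

```r
# ASSUMPTION: hypothetical densities for the four species
years <- seq(1960, 2010, by = 10)
lv_data <- list(
  squirrel = data.frame(year = years, density = c(110, 90, 70, 50, 30, 10)),
  skunk    = data.frame(year = years, density = c(80, 70, 60, 50, 40, 30)),
  coyote   = data.frame(year = years, density = c(60, 55, 50, 45, 40, 35)),
  fox      = data.frame(year = years, density = c(90, 78, 66, 54, 42, 30))
)

# Fit a line to one species and return the year where it reaches zero
x_intercept <- function(d) {
  co <- unname(coef(lm(density ~ year, data = d)))
  -co[1] / co[2]
}

extinction_year <- sapply(lv_data, x_intercept)  # named vector, one per species
at_risk <- names(which.min(extinction_year))     # lowest x-intercept

# Scatterplot plus regression line for the at-risk species only
d   <- lv_data[[at_risk]]
fit <- lm(density ~ year, data = d)
co  <- coef(fit)
plot_file <- file.path(tempdir(), "at_risk_lv.pdf")
pdf(plot_file)
plot(d$year, d$density, xlab = "Year", ylab = "Population density",
     main = paste("Linearville population:", at_risk))
abline(fit)
legend("topright", bty = "n",
       legend = sprintf("y = %.3f x + %.1f", co[2], co[1]))
invisible(dev.off())
```

Because `at_risk` is computed rather than typed in, the same lines should pick out whichever species comes first in a different data set.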

Phase 2: Simulating the behaviour of the at-risk species in a conservation project. Select the “Wild” file for the species you’ve determined is “at-risk”. This file will again contain two columns of data – one column for “year” and one column giving an observed population count. These are records of how a population of this species was observed to grow under “idealized” conditions. You’ll be trying to achieve something comparable in your conservation habitat, but with the added constraint of a maximum carrying capacity.

To begin, first calculate a set of year-by-year observed growth rates. Recall that each individual year-by-year λ can be found using

λt = Nt+1 / Nt

where Nt is the observed population count in year t. Next, determine the geometric mean of this set of λ values. Recall that this is most easily done by logging your lambda values, finding the mean of the logged values, and then exponentiating the result. Note that the growth rate (stated in terms of r and λ) is reported in my provided solution card for each data set – please check your work.
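The lambda and geometric-mean calculation might be sketched as follows. The counts and the column names `year` and `count` are made up for the illustration.

```r
# ASSUMPTION: hypothetical "Wild" data; real column names may differ
wild <- data.frame(year = 1:5, count = c(100, 120, 150, 180, 216))
n <- wild$count

# Each lambda is next year's count divided by this year's count:
# drop the first element for the numerators, the last for the denominators
lambda <- n[-1] / n[-length(n)]

# Geometric mean: log, average, exponentiate
lambda_avg <- exp(mean(log(lambda)))

# Corresponding instantaneous rate
r_avg <- log(lambda_avg)
```

One handy sanity check: with no missing years, the geometric mean of consecutive ratios collapses to (last count / first count) raised to the power 1/(number of lambdas), so the two calculations should agree.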

Next, we’ll simulate the population growth of the “at-risk” species in our conservation area using the following assumptions:

1. The initial population matches the initial population (year 1) of the “Wild” input file.

2. The carrying capacity for the conservation habitat, K, will be equal to the largest population value seen in the “Wild” input file.

3. Our prediction will run until year 40 – in addition to the “Year 1” population count (which matches year 1 of the “Wild” file), we’ll generate 39 additional years of predicted population counts.

4. To predict next year’s population from the previous year’s population, we’ll use the discrete logistic growth model. Specifically:

Nt+1  = Nt ∗ (1 + ln(λavg ) ∗ (1 − Nt /K))

or equivalently:

Nt+1  = Nt ∗ (1 + ravg  ∗ (1 − Nt /K))

where λavg is the geometric mean of your lambda values and ravg = ln(λavg) is the corresponding instantaneous rate. This would be an excellent opportunity to use a “loop” structure.

If all goes well, your predicted population in year 40 should match the value reported in my solution card.
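A minimal sketch of the loop implied by the four assumptions above is given here. The function name `simulate_logistic` and the input counts are hypothetical; only the update rule is taken from the handout.

```r
# Hypothetical helper implementing N[t+1] = N[t] * (1 + r_avg * (1 - N[t] / K))
simulate_logistic <- function(n1, K, r_avg, n_years = 40) {
  N <- numeric(n_years)  # pre-allocate one slot per simulated year
  N[1] <- n1             # year 1 matches year 1 of the "Wild" file
  for (t in 1:(n_years - 1)) {
    N[t + 1] <- N[t] * (1 + r_avg * (1 - N[t] / K))
  }
  N
}

# ASSUMPTION: illustrative "Wild" counts, not real data
wild_count <- c(100, 120, 150, 180, 216)
lambda     <- wild_count[-1] / wild_count[-length(wild_count)]
lambda_avg <- exp(mean(log(lambda)))

N_pred <- simulate_logistic(n1    = wild_count[1],
                            K     = max(wild_count),
                            r_avg = log(lambda_avg))
```

`N_pred[40]` is then the year-40 value to compare against the solution card when the real inputs are used.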

Once you’re confident that you’ve predicted the population growth correctly (under logistic conditions), determine which year (if any) was the first to have a population count exceeding (strictly greater than) K/2. As you may recall, this population level corresponds to the inflection point of the logistic growth curve – it is the population level that causes the greatest instantaneous rate of population increase. For this reason, it will serve as our marker for when the population in the conservation region is “mature” and ready for harvesting. In essence, we could determine how many animals would be added (on average) in a year that started with a population of K/2, and export that many animals from the habitat to Linearville instead.
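The “first year strictly above K/2” search, together with the yearly increase at K/2, could be sketched like this. The helper name, the population vector, K and the rate are all invented for the example.

```r
# Hypothetical helper: index of the first simulated year with N > K/2,
# or NA if the population never gets there within the simulation
first_mature_year <- function(N, K) {
  idx <- which(N > K / 2)
  if (length(idx) == 0) NA_integer_ else idx[1]
}

# ASSUMPTION: illustrative values only
K      <- 100
N_demo <- c(10, 30, 49, 50, 51, 80)
mature_year <- first_mature_year(N_demo, K)  # year 4 equals K/2, so it does not count

# Animals added in a year that starts at exactly K/2 under the discrete model:
# (K/2) * r_avg * (1 - (K/2)/K), which simplifies to r_avg * K / 4
r_avg   <- 0.2
harvest <- (K / 2) * r_avg * (1 - (K / 2) / K)
```

Returning NA for the “never reached” case keeps the “(if any)” possibility explicit instead of silently reporting a wrong year.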

Lastly, you can compare the date at which your conservation project reaches a population of K/2 to the year in which the local population around Linearville is predicted to go extinct (based on your regression model from phase 1). Keep in mind that the present date is 2010 – so your predicted “year 1” population would occur one year in the future (2011), and the final predicted year (year 40 from your simulation) would be year 2010+40 = 2050. If the conservation project should be ready before the local population is predicted to go extinct, that’s great – the conservation plan can be implemented as proposed. However, if the local population is predicted to go extinct before the conservation project would be ready for harvesting, you should note that the plan needs to be modified.
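The date comparison can be sketched as a small function. The name `plan_ready_in_time` and the example numbers are hypothetical, and treating an exact tie as “not in time” is my own assumption rather than something the handout specifies.

```r
# Hypothetical helper: TRUE if the habitat matures before the predicted extinction
plan_ready_in_time <- function(mature_sim_year, extinction_year,
                               present_year = 2010) {
  # Population never exceeded K/2 within the simulation: plan is too slow
  if (is.na(mature_sim_year)) return(FALSE)
  # Simulation year 1 is calendar year 2011, year 40 is 2050
  mature_calendar_year <- present_year + mature_sim_year
  # ASSUMPTION: "before" is read as strictly earlier
  mature_calendar_year < extinction_year
}

plan_ready_in_time(12, 2030.4)  # habitat matures in 2022
plan_ready_in_time(25, 2030.4)  # habitat matures in 2035
```

The first call corresponds to a plan that can go ahead as proposed; the second corresponds to a plan that needs to be more aggressive.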

To summarize your work in the second phase of the project, please note that you should prepare a short report (text file) documenting which species was at risk, its expected date of extinction around LV, and when the planned conservation project would be ready to supplement that population. You can use my solution card as a guideline for the sort of data you could report. Please also provide a plot showing two things:

1. the projected population growth in your conservation habitat  (i.e.   your predicted population values from the discrete logistic growth model), and

2. the population count values from the 16 years of “idealized” growth.

To fit both of these conveniently on the same plotting surface, you could start with a scatterplot of the contents of the “Wild” file using a plot command that includes the following argument: xlim=c(0, 40). This will set up the plot to have enough space on the x-axis to include the 40 years of data from your discrete logistic growth simulation.
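Here is one way the combined plot might be put together. The 16 “Wild” counts are fabricated for the demo (steady 15% growth, rounded), the column names are assumptions, and the plot is saved to a PDF in a temporary folder only so that the snippet can run unattended.

```r
# ASSUMPTION: hypothetical 16-year "Wild" record
wild <- data.frame(year = 1:16, count = round(100 * 1.15^(0:15)))

# Growth rate and carrying capacity derived from the "Wild" data
n      <- wild$count
r_avg  <- mean(log(n[-1] / n[-length(n)]))  # log of the geometric mean lambda
K      <- max(n)

# 40 years of discrete logistic predictions, starting from the year-1 count
N_pred <- numeric(40)
N_pred[1] <- n[1]
for (t in 1:39) {
  N_pred[t + 1] <- N_pred[t] * (1 + r_avg * (1 - N_pred[t] / K))
}

plot_file <- file.path(tempdir(), "wild_vs_logistic.pdf")
pdf(plot_file)
# xlim leaves room on the x-axis for all 40 simulated years
plot(wild$year, wild$count, xlim = c(0, 40), ylim = c(0, K),
     xlab = "Year", ylab = "Population count",
     main = "Idealized growth vs. logistic prediction")
lines(1:40, N_pred)  # overlay the predicted trajectory
legend("bottomright", bty = "n",
       legend = c("Wild (observed)", "Logistic (predicted)"),
       pch = c(1, NA), lty = c(NA, 1))
invisible(dev.off())
```

Since `lines` draws on the existing plot, it has to come after the `plot` call; setting `ylim` up front keeps both series within the plotting region.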