This is an archived file from the Spring 2022 version of the course.
See the current course website for a more recent version.

Project 2: Sequence Alignment and Analysis

Due: Thursday, 24 February, 7:59pm

Collaboration and Resource Policy
For this assignment, you are encouraged to work with one other person. Your team must satisfy these constraints:
  1. You did not work together on Project 1.
  2. The total number of siblings that you and your partner have is divisible by two (e.g., if you have one sibling, you need to find a partner with 1, 3, 5, or 7 siblings). Anyone with more than 7 siblings can partner with anyone!
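
As a quick sanity check, the sibling constraint can be sketched as a parity test. This is only an illustration of the rule as stated above; the function name can_partner and its parameters are made up for this example and are not part of the course materials.

```python
def can_partner(siblings_a: int, siblings_b: int) -> bool:
    """Return True if two students satisfy the sibling constraint.

    Hypothetical helper for illustration; not part of the assignment code.
    """
    if siblings_a < 0 or siblings_b < 0:
        raise ValueError("sibling counts must be non-negative")
    # Anyone with more than 7 siblings can partner with anyone.
    if siblings_a > 7 or siblings_b > 7:
        return True
    # Otherwise, the combined number of siblings must be divisible by two.
    return (siblings_a + siblings_b) % 2 == 0


print(can_partner(1, 3))  # one sibling paired with three siblings
print(can_partner(1, 2))  # one sibling paired with two siblings
print(can_partner(9, 2))  # more than 7 siblings pairs with anyone
```

Under this reading, two students are compatible when their sibling counts have the same parity, so a student with an even count needs a partner with an even count.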

We expect most students will have the best learning experience on this assignment by working with a partner, but if you prefer to work alone it is okay to do this assignment on your own.

You are permitted (actually encouraged) to discuss these problems with anyone you want, including other students in the class. If you do discuss the specific questions in the assignment with anyone other than your assignment partner and the course staff, though, you should list them in the External resources used section below.

You are welcome to use any resources you want for this assignment, other than ones that would defeat the purpose of the assignment. This means you should not look at answers or code from any other students in the class (other than your collaboration with your partner), and if you find code that implements the problem you are being asked to solve for the assignment, you should not use that code. You should document all external resources you use that are not part of the course materials in the External resources used section in your submission.

Getting Started

To get started on Project 2:

  1. One of your team members should create a new repository named csbio-project2 (you can pick a different name if you really want to).

    • Visit
    • Select “Private” for the type of repository.
    • Leave the “Initialize this repository with a README” option unchecked, since you will fetch the README later from a public repository.

  1. Share this repository with your teammate:

    • Visit the repository’s page.
    • Click the “Settings” button on the right side.
    • Click the “Collaborators” tab.
    • Enter your teammate’s GitHub ID, and select the user from the dropdown.
    • Click “Add” with “Write” privileges.

  1. Add the course staff to your repository by following the same steps to add evansuva, hyunjaecho94, and iamgroot42 to your collaborators with “Read” privileges. Be aware that this will allow the course staff to see everything you put in the repository! If you are worried about having rants against the course staff in your repo, you could work in a different repository and copy your finished project into a new repo to submit it, but this seems like a lot of hassle. Hopefully you don’t have any rants about the course staff, but it is best to put those rants somewhere else and share your repo when you create it so you don’t forget to share it later.

  1. Clone the empty private repository to your working environment. Replace mygithubname below with your GitHub username.
    git clone

You should see:

Cloning into 'csbio-project2'...
warning: You appear to have cloned an empty repository.
  1. Enter your csbio-project2 directory (cd csbio-project2).

  1. Fetch the assignment skeleton from our repository into your private repository. In the working directory of your empty repository, add a remote repository named course, pull the skeleton code from it, and push the result to your private repository by executing:
   git remote add course
   git pull course main
   git push --tags origin main

(If your directory is not empty, you will need to add --allow-unrelated-histories to the git pull course main command.)

After finishing these steps, you should have a project2 directory that includes project2.ipynb, the Jupyter notebook you will use for this assignment.

You can get started on the assignment by running jupyter notebook project2.ipynb in the project2 directory.