# Homework 6

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(rc={"figure.dpi":100, 'savefig.dpi':300})
sns.set_context('notebook')
sns.set_style("ticks")
from IPython.display import set_matplotlib_formats
set_matplotlib_formats('retina', 'svg')
import requests
import os
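# The function header was lost in this copy; the name and signature below are assumed.
def download(url, local_filename=None):
    """
    Download the file at the given URL and save it in the current working directory.
    """
    data = requests.get(url)
    # Default to the last component of the URL as the file name
    if local_filename is None:
        local_filename = os.path.basename(url)
    with open(local_filename, 'wb') as fd:
        fd.write(data.content)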


## Problem 1 - Loops and conditionals

Consider the following list:

data = [1, 4, 3, 10, 4, 3, 4, 4]

• Write a loop that computes the average of the elements in the list, then print the result using two significant digits.
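As a hedged illustration of the pattern (not a solution for `data`), the sketch below accumulates a running total over a different, made-up list called `example_values` and formats the result with the `.2g` format specifier, which keeps two significant digits. The list and variable names are assumptions chosen only for this example.

```python
# Hypothetical list, used only for illustration
example_values = [2, 5, 9]

total = 0
for value in example_values:
    total += value  # accumulate the running sum

average = total / len(example_values)

# '.2g' formats a number using two significant digits
formatted = f"{average:.2g}"
print(formatted)
```

Note that `.2g` counts significant digits, whereas `.2f` would fix the number of digits after the decimal point.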

# your code here

• Write code that finds the number of times the element 4 occurs in the list. Hint: Use a loop and an if-statement.

# your code here

• Write a Python function that takes a list as an argument and returns the number of times a given element (also passed as an argument to the function) appears in the list. Call that function find_number_of_occurences(a, elm). Make sure you follow best practices when writing the docstring of your function.
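One common docstring layout is a one-line summary followed by sections that describe the parameters and the return value. The sketch below shows that layout on a hypothetical function, `count_even`, which is not part of the assignment; the section style (NumPy-like) is one reasonable convention, assumed here rather than required by the problem.

```python
def count_even(a):
    """Return the number of even integers in a list.

    Parameters
    ----------
    a : list of int
        The list to search.

    Returns
    -------
    int
        How many elements of ``a`` are divisible by 2.
    """
    count = 0
    for x in a:
        if x % 2 == 0:
            count += 1
    return count


print(count_even([1, 2, 6, 7]))
```

Calling `help(count_even)` would display this docstring, which is a quick way to check how it reads to a user.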

# your code here

# Try your code here:
help(find_number_of_occurences)

# Try your code here:
find_number_of_occurences(data, 4)

• Write a Python function that takes a list as an argument and returns the number of elements that are greater than a given element (also passed as an argument to the function). Call that function find_number_of_elms_greater_than(a, elm). Make sure you follow best practices when writing the docstring of your function.

# your code here

# Try your code here:
help(find_number_of_elms_greater_than)

# Try your code here:
find_number_of_elms_greater_than(data, 3)


## Problem 2 - High-performance buildings revisited

In this problem, we continue analyzing the high-performance buildings dataset that we introduced in "Problem 1 - High-performance buildings" and explored further in "Selecting dataframe rows that satisfy a boolean expression". Let me set you up by downloading and cleaning the data file:

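url = 'https://raw.githubusercontent.com/PurdueMechanicalEngineering/me-297-intro-to-data-science/master/data/temperature_raw.xlsx'
import pandas as pd
# Load the spreadsheet; pandas can read an Excel file directly from a URL
df = pd.read_excel(url)
# Drop every row that contains a missing value
df = df.dropna(axis=0)
# Convert the date column from strings to datetime objects
df.date = pd.to_datetime(df['date'], format='%Y-%m-%d')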
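To see what the two cleaning steps do, here is a minimal sketch on a made-up dataframe called `toy`. The values are invented for illustration; only the column names `date` and `t_out` echo the dataset.

```python
import pandas as pd

# Toy data with one missing temperature (values are invented)
toy = pd.DataFrame({
    "date": ["2019-01-01", "2019-01-02", "2019-01-03"],
    "t_out": [30.5, None, 28.0],
})

# axis=0 drops rows (not columns) that contain missing values
clean = toy.dropna(axis=0)

# Parse the date strings into datetime objects
clean = clean.assign(date=pd.to_datetime(clean["date"], format="%Y-%m-%d"))
print(clean.dtypes)
```

After this, the `date` column supports time-aware plotting and comparisons rather than plain string operations.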

• Plot the external temperature t_out.

• Extract the data pertaining to household a5. Put the result in a new dataframe called df_a5.
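Selecting the rows that belong to one group is usually done with a boolean mask. The sketch below shows the idea on a toy dataframe; the column name `household` and the labels `x1`, `x2`, `x3` are assumptions for illustration, so check `df.columns` for the actual name in the dataset.

```python
import pandas as pd

# Toy dataframe; the column name 'household' is an assumption
toy = pd.DataFrame({
    "household": ["x1", "x2", "x1", "x3"],
    "t_unit": [70.1, 68.4, 71.3, 72.0],
})

# Boolean mask: True for the rows whose household label equals 'x1'
mask = toy["household"] == "x1"

# Indexing with the mask keeps only the rows where it is True
toy_x1 = toy[mask]
print(toy_x1)
```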

df_a5 = # Your code here

• For household a5, plot t_unit as a function of date.

# Your code here

• In the same figure, plot t_unit as a function of date for households a5 and a11.
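Drawing several curves in one figure amounts to calling the plotting method repeatedly on the same axes and giving each curve a label. This is a sketch with invented numbers (`series_one`, `series_two`, and the dates are all hypothetical), not the homework data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented data for two hypothetical units
dates = pd.date_range("2020-01-01", periods=5, freq="D")
series_one = [70, 71, 69, 72, 70]
series_two = [66, 67, 68, 66, 65]

fig, ax = plt.subplots()
# Both calls draw on the same axes, so the curves share one figure
ax.plot(dates, series_one, label="unit one")
ax.plot(dates, series_two, label="unit two")
ax.set_xlabel("date")
ax.set_ylabel("t_unit")
ax.legend(loc="best")
fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
```

The same approach works for scatter plots: call `ax.scatter` once per group on the same `ax`.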

# Your code here

• In the same figure, make scatter plots of t_out versus t_unit for both households a5 and a11.

# Your code here

• In the same figure, make scatter plots of t_out versus hvac for both households a5 and a11.

# Your code here

• In the same figure, plot the histogram of t_unit for households a5 and a11. Which household prefers cooler temperatures? Hint: To make the histogram more appealing use the keywords density=True, alpha=0.25.
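To show what the two keywords in the hint do, the sketch below overlays histograms of two synthetic samples. The samples are random numbers generated only for this example (the names `sample_a` and `sample_b` are hypothetical): `density=True` normalizes each histogram to unit area so that groups of different sizes are comparable, and `alpha=0.25` makes the bars translucent so that overlapping regions stay visible.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic samples, generated only for illustration
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=70.0, scale=2.0, size=500)
sample_b = rng.normal(loc=74.0, scale=2.0, size=500)

fig, ax = plt.subplots()
# density=True: bar areas sum to one; alpha=0.25: translucent bars
ax.hist(sample_a, density=True, alpha=0.25, label="sample a")
ax.hist(sample_b, density=True, alpha=0.25, label="sample b")
ax.set_xlabel("t_unit")
ax.set_ylabel("probability density")
ax.legend(loc="best")
```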

# Your code here


• In the same figure, plot the histogram of hvac for households a5 and a11. Which household is more energy efficient (if any) and why?

# Your code here


• Repeat the analysis above for households b17 and c40. Which household prefers cooler temperatures and which one is more energy efficient?
# your code here