{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "(lecture04:homework)=\n", "# Homework 4\n", "\n", "## Instructions\n", "\n", "+ Type your name and email in the \"Student details\" section below.\n", "+ Develop the code and generate the figures you need to solve the problems using this notebook.\n", "+ For the answers that require a mathematical proof or derivation you can either:\n", "    \n", "    - Type the answer using the built-in LaTeX capabilities. In this case, simply export the notebook as a PDF and upload it on Gradescope; or\n", "    - Print the notebook (after you are done with all the code), write your answers by hand, scan them, combine your response into a single PDF, and upload it on Gradescope.\n", "\n", "+ The total homework points are 100. Please note that the problems are not weighted equally.\n", "\n", "```{note}\n", "+ This is due before the beginning of the next lecture.\n", "+ Please match all the pages corresponding to each of the questions when you submit on Gradescope.\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Student details\n", "\n", "+ **First Name:**\n", "+ **Last Name:**\n", "+ **Email:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let me set you up with some nice code for plotting and downloading files." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "import seaborn as sns\n", "sns.set(rc={\"figure.dpi\":100, 'savefig.dpi':300})\n", "sns.set_context('notebook')\n", "sns.set_style(\"ticks\")\n", "from IPython.display import set_matplotlib_formats\n", "set_matplotlib_formats('retina', 'svg')\n", "\n", "import requests\n", "import os\n", "\n", "def download(url, local_filename=None):\n", "    \"\"\"\n", "    Download the file at ``url`` and save it in the current working directory.\n", "    \"\"\"\n", "    data = requests.get(url)\n", "    data.raise_for_status()  # Fail early if the download did not succeed\n", "    if local_filename is None:\n", "        local_filename = os.path.basename(url)\n", "    with open(local_filename, 'wb') as fd:\n", "        fd.write(data.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem 1 - Visual analysis of a variable-speed compressor experiment\n", "\n", "In this problem we are going to need [this](https://raw.githubusercontent.com/PurdueMechanicalEngineering/me-297-intro-to-data-science/master/data/compressor_data.xlsx) dataset. The dataset was kindly provided to us by [Professor Davide Ziviani](https://scholar.google.com/citations?user=gPdAtg0AAAAJ&hl=en).\n", "As before, you can either put it on your Google Drive or just download it with the code segment below:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>T_e</th><th>DT_sh</th><th>T_c</th><th>DT_sc</th><th>T_amb</th><th>f</th><th>m_dot</th><th>m_dot.1</th><th>Capacity</th><th>Power</th><th>Current</th><th>COP</th><th>Efficiency</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>-30</td><td>11</td><td>25</td><td>8</td><td>35</td><td>60</td><td>28.8</td><td>8.000000</td><td>1557</td><td>901</td><td>4.4</td><td>1.73</td><td>0.467</td></tr>\n",
"    <tr><th>1</th><td>-30</td><td>11</td><td>30</td><td>8</td><td>35</td><td>60</td><td>23.0</td><td>6.388889</td><td>1201</td><td>881</td><td>4.0</td><td>1.36</td><td>0.425</td></tr>\n",
"    <tr><th>2</th><td>-30</td><td>11</td><td>35</td><td>8</td><td>35</td><td>60</td><td>17.9</td><td>4.972222</td><td>892</td><td>858</td><td>3.7</td><td>1.04</td><td>0.382</td></tr>\n",
"    <tr><th>3</th><td>-25</td><td>11</td><td>25</td><td>8</td><td>35</td><td>60</td><td>46.4</td><td>12.888889</td><td>2509</td><td>1125</td><td>5.3</td><td>2.23</td><td>0.548</td></tr>\n",
"    <tr><th>4</th><td>-25</td><td>11</td><td>30</td><td>8</td><td>35</td><td>60</td><td>40.2</td><td>11.166667</td><td>2098</td><td>1122</td><td>5.1</td><td>1.87</td><td>0.519</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>60</th><td>10</td><td>11</td><td>45</td><td>8</td><td>35</td><td>60</td><td>245.2</td><td>68.111111</td><td>12057</td><td>2525</td><td>11.3</td><td>4.78</td><td>0.722</td></tr>\n",
"    <tr><th>61</th><td>10</td><td>11</td><td>50</td><td>8</td><td>35</td><td>60</td><td>234.1</td><td>65.027778</td><td>10939</td><td>2740</td><td>12.3</td><td>3.99</td><td>0.719</td></tr>\n",
"    <tr><th>62</th><td>10</td><td>11</td><td>55</td><td>8</td><td>35</td><td>60</td><td>222.2</td><td>61.722222</td><td>9819</td><td>2929</td><td>13.1</td><td>3.35</td><td>0.709</td></tr>\n",
"    <tr><th>63</th><td>10</td><td>11</td><td>60</td><td>8</td><td>35</td><td>60</td><td>209.3</td><td>58.138889</td><td>8697</td><td>3091</td><td>13.7</td><td>2.81</td><td>0.693</td></tr>\n",
"    <tr><th>64</th><td>10</td><td>11</td><td>65</td><td>8</td><td>35</td><td>60</td><td>195.4</td><td>54.277778</td><td>7575</td><td>3223</td><td>14.2</td><td>2.35</td><td>0.672</td></tr>\n",
"  </tbody>\n", "</table>\n", "<p>65 rows × 13 columns</p>\n", "</div>
" ], "text/plain": [ " T_e DT_sh T_c DT_sc T_amb f m_dot m_dot.1 Capacity Power \\\n", "0 -30 11 25 8 35 60 28.8 8.000000 1557 901 \n", "1 -30 11 30 8 35 60 23.0 6.388889 1201 881 \n", "2 -30 11 35 8 35 60 17.9 4.972222 892 858 \n", "3 -25 11 25 8 35 60 46.4 12.888889 2509 1125 \n", "4 -25 11 30 8 35 60 40.2 11.166667 2098 1122 \n", ".. ... ... ... ... ... .. ... ... ... ... \n", "60 10 11 45 8 35 60 245.2 68.111111 12057 2525 \n", "61 10 11 50 8 35 60 234.1 65.027778 10939 2740 \n", "62 10 11 55 8 35 60 222.2 61.722222 9819 2929 \n", "63 10 11 60 8 35 60 209.3 58.138889 8697 3091 \n", "64 10 11 65 8 35 60 195.4 54.277778 7575 3223 \n", "\n", " Current COP Efficiency \n", "0 4.4 1.73 0.467 \n", "1 4.0 1.36 0.425 \n", "2 3.7 1.04 0.382 \n", "3 5.3 2.23 0.548 \n", "4 5.1 1.87 0.519 \n", ".. ... ... ... \n", "60 11.3 4.78 0.722 \n", "61 12.3 3.99 0.719 \n", "62 13.1 3.35 0.709 \n", "63 13.7 2.81 0.693 \n", "64 14.2 2.35 0.672 \n", "\n", "[65 rows x 13 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url = 'https://raw.githubusercontent.com/PurdueMechanicalEngineering/me-297-intro-to-data-science/master/data/compressor_data.xlsx'\n", "download(url)\n", "\n", "import pandas as pd\n", "data = pd.read_excel('compressor_data.xlsx')\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data are part of an experimental study of a variable-speed reciprocating compressor.\n", "The experimentalists varied two temperatures, $T_e$ and $T_c$ (both in degrees C), and measured various other quantities.\n", "Our goal is to understand the experimental design and develop some understanding of the map from $T_e$ and $T_c$ to the measured Capacity and Power (both in W).\n", "Answer the following questions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot of $T_e$ and $T_c$. This will reveal the experimental design picked by the experimentalists. 
Make sure you label the axes correctly. Hint: These are columns `T_e` and `T_c` of the data frame `data`. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Is there a gap in the experimental design? If yes, why do you think they have a gap?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your explanation here.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot between `T_e` and `Capacity`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot between `T_c` and `Capacity`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot between `T_e` and `Power`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot between `T_c` and `Power`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ We are lucky that we only have two experimental control variables, because we can do a bit more with a scatter plot. You can color each point in the scatter plot according to a scale that follows an output variable. Let me show you what I mean by doing the plot for the `Capacity`." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "image/svg+xml": [ "<!-- Figure (Matplotlib v3.3.4): scatter plot of T_e vs. T_c, points colored by Capacity -->\n" ], "text/plain": [ "
" ] }, "metadata": { "image/png": { "height": 372, "width": 560 } }, "output_type": "display_data" } ], "source": [ "from matplotlib import cm\n", "fig, ax = plt.subplots()\n", "cs = ax.scatter(data['T_e'], data['T_c'],  # So far a standard scatter plot\n", "                c=data['Capacity'],        # This is telling matplotlib what the color\n", "                                           # of the points should be\n", "                cmap=cm.jet                # This is saying to use the jet colormap\n", "                                           # (blue = smallest values, red = highest values)\n", "               )\n", "plt.colorbar(cs, label='Capacity')  # This gives us a colorbar\n", "ax.set_xlabel('$T_e$')\n", "ax.set_ylabel('$T_c$');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now repeat the same thing for the `Power`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem 2 - Visual analysis of an airfoil experiment\n", "\n", "In this problem, you are going to repeat what you did in Problem 1, but without my guidance!\n", "\n", "The dataset we are going to use is the [Airfoil Self-Noise Data Set](https://archive.ics.uci.edu/ml/datasets/Airfoil+Self-Noise#).\n", "From this reference, the description of the dataset is as follows:\n", "\n", "> The NASA data set comprises different size NACA 0012 airfoils at various wind tunnel speeds and angles of attack. The span of the airfoil and the observer position were the same in all of the experiments.\n", "> \n", "> Attribute Information:\n", "> This problem has the following inputs:\n", "> 1. Frequency, in Hertzs.\n", "> 2. Angle of attack, in degrees.\n", "> 3. Chord length, in meters.\n", "> 4. Free-stream velocity, in meters per second.\n", "> 5. Suction side displacement thickness, in meters.\n", "\n", "> The only output is:\n", "> 6. Scaled sound pressure level, in decibels.\n", "\n", "Before we start, let's download and load the data.\n", "I am going to put them in a dataframe for you." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np  # Needed for np.loadtxt below\n", "\n", "url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00291/airfoil_self_noise.dat'\n", "download(url)\n", "raw_data = np.loadtxt('airfoil_self_noise.dat')\n", "df = pd.DataFrame(raw_data, columns=['Frequency', 'Angle_of_attack', 'Chord_length',\n", "                                     'Velocity', 'Suction_thickness', 'Sound_pressure'])\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the histograms of all variables. Use as many code segments as you need below to plot the histogram of each variable in a different plot. Make sure you label the axes correctly." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here (as many blocks as you like)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot between all input variables. This will give you an idea of the range of experimental conditions. Are there any holes in the experimental dataset, i.e., places where you have no data?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here (as many blocks as you like)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your explanation here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Do the scatter plot between each input variable and the output. This will give you an idea of the relationship between each input and the output. Do you observe any obvious patterns?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here (as many blocks as you like)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your explanation here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Now pick the two input variables you think are the most important and do the scatter plot between them using the output to color the points (see the last question of Problem 1). Feel free to repeat it with more than two pairs of inputs if you want. Briefly discuss your findings." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here (as many blocks as you like)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your explanation here*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 4 }