GradePack

Suppose you found the house that you have been looking for for years! It comes with a price tag of $285,000. As luck would have it, you also found a bank that is running a special: it will finance your house if you make a $25,000 down payment. The loan comes with a 7.5% APR over 30 years. If you decide to take up the offer and make the down payment, what would your monthly payment be? (Round your answer to two decimal places.)

Monthly payment amount: $[pmt]

Using the table below, fill in the answer for every box and show how you got the numbers (that is, write out the calculations behind each answer).

Month | Beg. Loan Bal. (show how you got the number) | Monthly Payment (no need to show how you got the number) | Applied to Interest (show how you got the number) | Applied to Principal (show how you got the number) | Ending Loan Bal. (show how you got the number)
1st Month | $[loan] | $[pmt] | $[1int] | $[1prin] | $[1endbal]
2nd Month | $[2nd] | $[pmt] | $[2int] | $[2prin] | $[2endbal]
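The arithmetic behind this table can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard fixed-rate annuity formula with monthly compounding (monthly rate = APR / 12) and rounding to the nearest cent at each step; the function names `monthly_payment` and `amortization_rows` are illustrative and not part of the original problem.

```python
def monthly_payment(principal, apr, years):
    """Level monthly payment for a fixed-rate loan (standard annuity formula)."""
    r = apr / 12          # monthly interest rate (assumes APR compounds monthly)
    n = years * 12        # total number of monthly payments
    return principal * r / (1 - (1 + r) ** -n)


def amortization_rows(principal, apr, years, months):
    """Return the first `months` rows of the schedule as tuples:
    (month, beginning balance, payment, interest, principal paid, ending balance).
    Rounding to cents at every step is an assumption of this sketch."""
    pmt = round(monthly_payment(principal, apr, years), 2)
    r = apr / 12
    balance = principal
    rows = []
    for month in range(1, months + 1):
        interest = round(balance * r, 2)          # interest accrues on the beginning balance
        to_principal = round(pmt - interest, 2)   # remainder of the payment reduces principal
        ending = round(balance - to_principal, 2)
        rows.append((month, balance, pmt, interest, to_principal, ending))
        balance = ending                          # next month starts at this month's ending balance
    return rows


# Figures from the problem: $285,000 price, $25,000 down, 7.5% APR, 30 years.
loan = 285_000 - 25_000
for row in amortization_rows(loan, 0.075, 30, months=2):
    print(row)
```

Each row follows the same chain: the beginning balance is the previous ending balance, interest is the beginning balance times the monthly rate, and whatever is left of the payment goes to principal.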

Problem A.1 (10 Points) Two point charges, 1 nC and -1 nC, are placed at (5,3,1) and (2,0,1), respectively. An infinite sheet of charge parallel to the x-z plane is placed at y = -5. The sheet has a uniform charge density of 4 pC/m^2. Find the net electric field intensity at the origin if the dielectric constant in the region containing the charges and the observation point is 2.

Problem A.2 (5 Points) Design a coaxial capacitor with capacitance 10 pF under the following constraints. Note that there are multiple answers to this problem.
  • The dielectric constant of the insulator inside the capacitor is 6.
  • The minimum radius of the inner conductor is 10 mm.
  • The maximum radius of the outer conductor is 30 mm.
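Both problems reduce to short calculations, sketched below in Python. The sketch rests on assumptions the problem does not state: coordinates are taken to be in meters, the field is obtained by superposing the standard point-charge and infinite-sheet expressions in a uniform dielectric, and the coaxial design uses the ideal formula C = 2*pi*eps*L / ln(b/a) with fringing ignored. The function names and the choice a = 10 mm, b = 30 mm are illustrative; any radii within the constraints give a valid design with a different length.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, F/m


def point_charge_field(q, source, obs, eps_r):
    """E field (V/m) at `obs` from a point charge q (C) located at `source`."""
    R = [o - s for o, s in zip(obs, source)]      # vector from the charge to the observation point
    dist = math.sqrt(sum(c * c for c in R))
    k = q / (4 * math.pi * EPS0 * eps_r * dist ** 3)
    return [k * c for c in R]


def sheet_field(rho_s, y_sheet, obs, eps_r):
    """E field (V/m) from an infinite sheet at y = y_sheet, parallel to the x-z plane."""
    sign = 1.0 if obs[1] > y_sheet else -1.0      # field points away from a positive sheet
    return [0.0, sign * rho_s / (2 * EPS0 * eps_r), 0.0]


def coax_length(capacitance, a, b, eps_r):
    """Length (m) of an ideal coaxial capacitor with inner radius a and outer radius b."""
    return capacitance * math.log(b / a) / (2 * math.pi * EPS0 * eps_r)


# Problem A.1: superpose the three contributions at the origin (coordinates assumed in meters).
origin = (0.0, 0.0, 0.0)
contributions = [
    point_charge_field(1e-9, (5, 3, 1), origin, eps_r=2),
    point_charge_field(-1e-9, (2, 0, 1), origin, eps_r=2),
    sheet_field(4e-12, -5.0, origin, eps_r=2),
]
E_net = [sum(axis) for axis in zip(*contributions)]
print("E at origin (V/m):", E_net)

# Problem A.2: one admissible design, using the limiting radii.
length = coax_length(10e-12, a=0.010, b=0.030, eps_r=6)
print("Required length (m):", length)
```

Choosing a smaller ratio b/a shortens the required length, which is why the problem admits multiple answers.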

Suppose you found the house that you have been looking for for years! It comes with a price tag of $285,600. In preparation for this very moment, you have been saving for the last five years and have managed to save $20,000, which you plan to use as a down payment on the house. If the mortgage company is willing to finance your house with a 5.5% APR, 20-year loan, what would your monthly payment be? $[pmt1] (Round your answer to two decimal places.)

Using the monthly payment, fill out all the blank boxes below. Fill in the answer for every box and show how you got the numbers (that is, write out the calculations behind each answer).

Month | Beg. Loan Bal. (show how you got the number) | Monthly Payment (no need to show how you got the number) | Applied to Int. (show how you got the number) | Applied to Prin. (show how you got the number) | End Loan Bal. (show how you got the number)
1st Month | $[loan] | $[pmt1] | $[1int] | $[1prin] | $[1endbal]

Assignment: Reread your submission for the very first writing assignment of the course, your memoir about starting college. Examine how the writing appeals to readers and communicates a message. Then use the writing process to draft a reflection on whether your memoir achieved your writing goals. Consider and address the rhetorical concepts and writing strategies we’ve covered this semester.

Instructions: Draft and document your writing process in the fields provided. Your portfolio will include the following:
  • Printed copy of memoir
  • Brainstorming
  • Thesis statement and brief outline
  • Rough draft (or a more developed, detailed outline)
  • Revised draft

Find the product: (y – 10)(y + 1)

Factor the four-term polynomial by grouping: x^3 + 7x^2 + x + 7

Honorlock offers troubleshooting and technical support by phone and online chat to help you 24 hours a day.

Calculate the integral by first reversing the order of integration

When a patient stands up, what is the hydrostatic pressure at the ankles?

Question 1 ML acceleration [22 points] [25 minutes]

(1.1) [6 points] Why is the TPU’s power lower than the GPU’s? How can the TPU perform as well as or better than the GPU without the GPU’s overheads?

(1.2) [4 points] Why is a larger systolic array better for MM in ML? Why does a systolic array not become inefficient (i.e., remain efficient) with size?

(1.3) [4 points] What is the key advantage of unstructured sparsity compared to structured sparsity in ML models? What end benefits does the advantage lead to?

(1.4) [8 points] Assuming unstructured, one-sided (weights-only), sparse ML models have 25% density (i.e., 75% of the weights are zeros), how would you modify the TPU shown below to handle specifically (a) MAC underutilization and (b) load imbalance across cells? Assume the input and output activations are dense and that a weight may be displaced down by at most one cell to achieve load balance (similar to Eureka). Recall that the TPU computes inner products with weights held stationary in the MAC systolic array, with the dense input (output) activations streamed in (out) from the left (bottom).
(a) (4 points) MAC underutilization:
(b) (4 points) Load imbalance across cells:
Modify the above figure.

Question 2 PIM [24 Points] [25 minutes]

(2.1) (4 points) What key workload characteristic would make a memory-bound workload unfit for PIM? Be specific.

(2.2) (4 points) In an MV multiplication on a Newton-like PIM, what operation occurs in parallel with a column read of the matrix? What is the cost of avoiding that operation?

(2.3) (6 points) What does Newton’s interleaved layout achieve? What does the layout lose? Why is the loss acceptable?

(2.4) (10 points) Assume (1) o is the DRAM bank activation time (time to read a bank’s row into the bank’s row buffer), (2) all n banks can be activated in parallel (without any t_FAW restrictions), and (3) compute/read time per column is t_COL and there are c columns per DRAM row.
Ignore all other overheads in an MV computation.
(a) (5 points) What is the time to compute one DRAM row across all banks in a Newton-like PIM?
(b) (3 points) What is the time to compute one DRAM row across all banks in a non-PIM (e.g., GPU + standard DRAM), assuming the only exposed time is the time to read the row out one column at a time in a standard DRAM?
(c) (2 points) If we assume that o = c/4 * t_COL, what is the speedup of PIM over non-PIM?

Question 3 Network Acceleration [6 Points] [5 minutes]

(3.1) (3 points) What is the key performance requirement in network routers?

(3.2) (3 points) What key flexibility does a programmable router bring to networks?

Question 4 Polynomial accelerator [28 Points] [30 minutes]

Purdue CompE ML faculty have had a breakthrough! They have invented polynomial-based models which are far more accurate than the standard matrix-based models. In these new models, the key compute primitive is modulo polynomial multiplication. Each model involves computing trillions of this primitive for polynomial filters and features. The polynomials are of degree < 32 (degree is the power of the polynomial’s highest power term with a non-zero coefficient). Modulo multiplication here is polynomial multiplication followed by modulo using the fixed, simple, constant polynomial x^32 so that the result is a polynomial of degree < 32. (Yes, I can make up workloads so that the problem is neither too easy nor too hard.) For example, (x^3 + 4x + 2)*(2x^2 + 3) modulo x^4 is (11x^3 + 4x^2 + 12x + 6). We wish to build hardware for this primitive. This question has nothing to do with FHE or FHE’s NTT. Assume the coefficients are FP16 and are arranged in decreasing power-term order for each polynomial and the powers are non-negative integers < 32.
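The primitive described above can be pinned down with a small software reference model. This is only a sketch of the arithmetic, not the hardware organization the question asks for; the function name `polymul_mod_xn` is hypothetical, and plain Python numbers stand in for FP16 coefficients. It relies on the fact that reducing modulo x^n simply discards every term of degree n or higher.

```python
def polymul_mod_xn(a, b, n):
    """Multiply polynomials a and b and reduce modulo x**n.

    Coefficients are listed in decreasing power order, as in the problem
    statement. The result has exactly n coefficients (degree < n), also in
    decreasing power order.
    """
    a_low = a[::-1]            # switch to increasing power order so index == power
    b_low = b[::-1]
    out = [0] * n
    for i, ai in enumerate(a_low):
        if i >= n:
            break              # every product from here on has degree >= n
        for j, bj in enumerate(b_low):
            if i + j >= n:
                break          # modulo x**n drops terms of degree >= n
            out[i + j] += ai * bj   # partial products with equal power accumulate
    return out[::-1]


# The degree-4 example: (x^3 + 4x + 2) * (2x^2 + 3) modulo x^4
result = polymul_mod_xn([1, 0, 4, 2], [2, 0, 3], 4)
print(result)  # [11, 4, 12, 6], i.e. 11x^3 + 4x^2 + 12x + 6
```

The inner loop makes the structure visible: each output coefficient is a sum of products whose indices add up to the same power, and the modulo only limits which index pairs are ever formed.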
(4.1) (4 points) What is the space and time complexity of polynomial multiplication (without any modulo)? Define the parameter in your complexity measure. Is polynomial multiplication compute- or memory-bound?

(4.2) (4 points) How is one polynomial multiplication without any modulo similar to one MM? How is it different?
Similar:
Different:

(4.3) (4 points) How would you address in hardware Q4.2’s “different” aspect? What key property of polynomial multiplication simplifies handling this aspect?

(4.4) (4 points) How does including modulo x^32 change your solution to Q4.3?

(4.5) (6 points) What is your accelerator organization for one polynomial multiplication with modulo? What is your basic strategy to implement the multiplication? While sequential designs are unacceptable, an unoptimized parallel design is acceptable. Optimizations are left for Q4.6.
(4 points) Draw your accelerator organization. Label the blocks and inputs with well-known terms without digital logic/circuit-level details.
(2 points) Describe your strategy in terms of what happens to the coefficients and powers of each polynomial in your accelerator. You may describe your strategy using various components of your accelerator.

(4.6) (2 points) What are two key difficulties faced by your design? These difficulties are common to other accelerators as well.

(4.7) (4 points) How would you solve each difficulty? How would your building block change?

Question 5 And finally, a non-MM model [20 Points] [30 minutes]

Gradient boosting trees (GBT) is a well-known, high-accuracy, non-MM model for classifying table-based data with numerical and non-numeric, categorical fields (e.g., gender, race, education, state of residence). Such table-based data is prevalent in the real world (e.g., relational databases and spreadsheets). Each record in a table has multiple numerical and categorical fields (e.g., 100 fields).
GBT uses an ensemble of multiple weak models (e.g., 500 shallow, 5-deep binary decision trees) to produce a strong model. GB training involves binning the training data records into small histograms based on each field (e.g., a small 256-entry histogram for each of 100 fields). Binning simply increments the matching bin’s counter. While the numerical fields map to 256 bins, categorical fields may map to fewer bins (e.g., yes/no fields use two bins). For simplicity, we skip the other steps in building the decision trees based on the histograms. The histogram counts use 64 bits (8 bytes) (so a histogram is 256*8B = 2KB). In inference, each input record traverses many (e.g., 500) shallow decision trees whose decisions are combined for the final inference (the combining details are ignored for simplicity). A 5-deep binary decision tree has at most 1+2+4+8+16+32 = 63 nodes where each node computes a simple decision (e.g., field4 < 25, field6 == true) which may be encoded using at most 8 bytes (so a tree is at most 64*8B = 512 bytes). The trees need not be full binary trees, but this detail can be ignored. We wish to build an accelerator for GBT that accelerates both training and inference with most of the same hardware.

(5.1) (4 points) Where is the parallelism in GBT training and in inference?
Training:
Inference:

(5.2) (4 points) Where are the dependencies in GBT training and in inference?
Training:
Inference:

(5.3) (6 points) Why would GPUs not work for GBT training or inference?
Training:
Inference:

(5.4) (6 points) What key observation lets you use most of the same hardware for training and inference? Describe your accelerator organization (analogous to “a 128x128 MAC array where each MAC computes a weight-activation product in training and inference”).

Congratulations, you are almost done with the Final Exam. DO NOT end the Honorlock session until you have submitted your work.
When you have answered all questions:
  • Use your smartphone to scan your answer sheet and save the scan as a PDF. Make sure your scan is clear and legible.
  • Submit your PDF as follows: Email your PDF to yourself or save it to the cloud (Google Drive, etc.). Click this link to submit your work: Final Exam
  • Return to this window and click the button below to agree to the honor statement.
  • Click Submit Quiz to end the exam.
  • End the Honorlock session.
