GradePack

Internet_Scale_Computing_1c Giant-Scale Services

The context for this question is the same as the previous question.

1. You are building a planetary key-value store, where:
  • The key is a unique sensor ID for each sensor deployed across the world.
  • The value is information collected by the sensor, such as environmental conditions (temperature, humidity, pollution levels), industrial data (machine status, output), and more.

Your design aims to provide high availability and fast access time to process real-time queries to the sensor data store.
  • You have 5 data centers, each with an identical replica of the store.
  • Each data center has 100,000 servers.
  • The store is partitioned into 10,000 equal shards and stored on 10,000 servers.
  • Each shard is replicated on 10 servers.
  • You intend to provide “full harvest” for each query and fully exploit the available parallelism for processing each query.
  • A server assigned to a query is dedicated to that query for the duration of the query processing.

(c) [2 points] Assume that you perform a rolling upgrade whenever there is a hardware upgrade. The rolling upgrade takes down 10% of the servers at a time in each phase. What will be the yield during every upgrade phase? Explain.
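
One way to reason about the yield question is to count how many full-harvest queries can run at once before and during an upgrade phase. The sketch below is not an official solution. It assumes that each upgrade phase takes down exactly one replica of every shard in each data center (10% of 100,000 servers = 10,000 servers = one server per shard), and that the offered load equals the full capacity of the system. The function name `concurrent_queries` is hypothetical.

```python
def concurrent_queries(data_centers, replicas_per_shard, replicas_down=0):
    """Full-harvest queries that can run at once.

    Each query occupies one server per shard, so one complete set of
    shard replicas serves exactly one query at a time.
    """
    return data_centers * (replicas_per_shard - replicas_down)


DATA_CENTERS = 5
SHARDS = 10_000
REPLICAS_PER_SHARD = 10
SERVERS_PER_DC = SHARDS * REPLICAS_PER_SHARD  # 100,000, as stated in the question

# Assumption: the 10% of servers taken down in a phase form whole replica sets.
servers_down_per_dc = SERVERS_PER_DC // 10       # 10,000 servers
replicas_down = servers_down_per_dc // SHARDS    # 1 replica of every shard

normal_capacity = concurrent_queries(DATA_CENTERS, REPLICAS_PER_SHARD)
upgrade_capacity = concurrent_queries(DATA_CENTERS, REPLICAS_PER_SHARD, replicas_down)

# Yield = queries completed / queries offered, with offered load = normal capacity.
upgrade_yield = upgrade_capacity / normal_capacity
print(normal_capacity, upgrade_capacity, upgrade_yield)
```

Under these assumptions harvest stays at 100%, because every shard still has 9 live replicas, while the number of queries served in parallel drops from 50 to 45. If the upgrade instead took down servers without regard to shard boundaries, the count would need a different model.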

When they occur in the atmosphere, aerosols such as dust and sulfates tend to decrease environmental temperature. 

Internet_Scale_Computing_2c Map Reduce

The context for this question is the same as the previous question.

2. Consider the following implementation of a MapReduce Application. It operates on a cluster of server nodes with the following execution model:
  • Each worker thread executes its assigned map tasks sequentially (one map task at a time).
  • Intermediate data from each map task is stored on the worker’s local disk.
  • Data transfer occurs for reducers to collect the intermediate data from the mapper tasks.
  • No network cost for accessing data on the same server node.
  • Network transfer cost applies only between different server nodes.
  • All inter-server-node data transfers can occur in parallel.
  • A reduce task begins processing only after receiving all its required intermediate data.
  • Each worker thread executes its assigned reduce tasks sequentially (one reduce task at a time).

Specifications of the MapReduce Application to be run:
  • Input data: 100GB split into 100 shards of 1GB each
  • Number of map tasks: 100 (one per shard)
  • Number of reduce tasks: 10 (the desired number of outputs from the Map-Reduce Application)
  • Each map task produces 100MB of intermediate data
  • Each reduce task gets an equal amount of intermediate data from each of the map tasks to process for generating the final output

Simplifying assumptions:
  • Ignore local disk I/O time.
  • All network paths between server nodes have the same bandwidth.
  • Parallel network transfers don’t affect each other (no bandwidth contention).
  • All data transfers occur ONLY after ALL the map tasks have completed execution.
  • Perfect load balancing (work distributed evenly to all reduce tasks).
  • All server nodes in a given configuration have identical performance.

Compare two different cluster configurations:

Configuration A (High-Performance Server Nodes):
  • 5 server nodes
  • Processing speed: 1 minute per GB (for either map or reduce task)
  • Network transfer speed: 2GB per minute between server nodes

Configuration B (Commodity Nodes):
  • 10 server nodes
  • Processing speed: 1.5 minutes per GB (for either map or reduce task)
  • Network transfer speed: 1GB per minute between server nodes

(c) [1 point] Which configuration is faster in this example – configuration A (smaller number of high-performance nodes) or configuration B (larger number of commodity nodes)?
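
The comparison can be checked with a small model of the three phases. This is a sketch under stated assumptions, not an official solution: one worker thread per server node, map and reduce tasks spread evenly over the nodes, and 1 GB = 1000 MB. The question does not say exactly how "parallel" transfers are timed, so the hypothetical flag `serialize_inbound` tries two readings: every node-to-node link finishes independently, or each node receives from its peers one link after another.

```python
from fractions import Fraction


def total_minutes(nodes, min_per_gb, gb_per_min, serialize_inbound=False):
    """Map + communication + reduce time in minutes for one configuration."""
    map_tasks, reduce_tasks, shard_gb = 100, 10, 1
    slice_gb = Fraction(1, 100)  # 100 MB per map / 10 reducers = 10 MB per (map, reducer)

    maps_per_node = map_tasks // nodes
    reducers_per_node = reduce_tasks // nodes

    # Map phase: each node runs its map tasks one after another.
    map_time = maps_per_node * shard_gb * min_per_gb

    # Communication: data one node sends to the reducers hosted on one other node.
    link_gb = maps_per_node * reducers_per_node * slice_gb
    inbound_links = (nodes - 1) if serialize_inbound else 1
    comm_time = inbound_links * link_gb / gb_per_min

    # Reduce phase: each reducer processes one slice from every map task.
    reduce_time = reducers_per_node * (map_tasks * slice_gb) * min_per_gb

    return map_time + comm_time + reduce_time


config_a = dict(nodes=5, min_per_gb=Fraction(1), gb_per_min=Fraction(2))
config_b = dict(nodes=10, min_per_gb=Fraction(3, 2), gb_per_min=Fraction(1))

for serialize in (False, True):
    a = total_minutes(**config_a, serialize_inbound=serialize)
    b = total_minutes(**config_b, serialize_inbound=serialize)
    print(f"serialize_inbound={serialize}: A={float(a):.1f} min, B={float(b):.1f} min")
```

Under both readings of the communication phase, configuration B comes out ahead in this model, because the map phase dominates and B halves the number of map tasks each node runs sequentially.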

Internet_Scale_Computing_2a Map Reduce

2. Consider the following implementation of a MapReduce Application. It operates on a cluster of server nodes with the following execution model:
  • Each worker thread executes its assigned map tasks sequentially (one map task at a time).
  • Intermediate data from each map task is stored on the worker’s local disk.
  • Data transfer occurs for reducers to collect the intermediate data from the mapper tasks.
  • No network cost for accessing data on the same server node.
  • Network transfer cost applies only between different server nodes.
  • All inter-server-node data transfers can occur in parallel.
  • A reduce task begins processing only after receiving all its required intermediate data.
  • Each worker thread executes its assigned reduce tasks sequentially (one reduce task at a time).

Specifications of the MapReduce Application to be run:
  • Input data: 100GB split into 100 shards of 1GB each
  • Number of map tasks: 100 (one per shard)
  • Number of reduce tasks: 10 (the desired number of outputs from the Map-Reduce Application)
  • Each map task produces 100MB of intermediate data
  • Each reduce task gets an equal amount of intermediate data from each of the map tasks to process for generating the final output

Simplifying assumptions:
  • Ignore local disk I/O time.
  • All network paths between server nodes have the same bandwidth.
  • Parallel network transfers don’t affect each other (no bandwidth contention).
  • All data transfers occur ONLY after ALL the map tasks have completed execution.
  • Perfect load balancing (work distributed evenly to all reduce tasks).
  • All server nodes in a given configuration have identical performance.

Compare two different cluster configurations:

Configuration A (High-Performance Server Nodes):
  • 5 server nodes
  • Processing speed: 1 minute per GB (for either map or reduce task)
  • Network transfer speed: 2GB per minute between server nodes

Configuration B (Commodity Nodes):
  • 10 server nodes
  • Processing speed: 1.5 minutes per GB (for either map or reduce task)
  • Network transfer speed: 1GB per minute between server nodes

(a) [7 points] Calculate the total execution time for this MapReduce Application on configuration A. You should express the execution time in terms of:
  i. Time taken by map phase
  ii. Time taken by communication phase – transfer of intermediate data to the server nodes performing reduce tasks
  iii. Time taken by reduce phase
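
The three phases for configuration A can be worked through step by step. The sketch below is one possible reading, not an official solution. It assumes one worker thread per server node, tasks spread evenly across nodes, 1 GB = 1000 MB, and that every node-to-node transfer runs in parallel at the full link speed, so the communication phase lasts as long as a single link transfer.

```python
from fractions import Fraction

# Configuration A, as given in the question.
NODES = 5
MIN_PER_GB = Fraction(1)   # processing speed
GB_PER_MIN = Fraction(2)   # network speed between nodes

MAP_TASKS, REDUCE_TASKS = 100, 10
SHARD_GB = 1
MAP_OUTPUT_GB = Fraction(100, 1000)  # 100 MB, assuming 1 GB = 1000 MB

# Assumption: one worker thread per node, tasks spread evenly.
maps_per_node = MAP_TASKS // NODES         # 20 map tasks per node
reducers_per_node = REDUCE_TASKS // NODES  # 2 reduce tasks per node

# i. Map phase: 20 shards of 1 GB processed sequentially on each node.
map_minutes = maps_per_node * SHARD_GB * MIN_PER_GB

# ii. Communication phase: each map sends an equal slice to each reducer.
slice_gb = MAP_OUTPUT_GB / REDUCE_TASKS                  # 10 MB per (map, reducer) pair
link_gb = maps_per_node * reducers_per_node * slice_gb   # data sent from one node to another
comm_minutes = link_gb / GB_PER_MIN                      # all links run in parallel

# iii. Reduce phase: each reducer processes one slice from every map task.
reduce_input_gb = MAP_TASKS * slice_gb                   # 1 GB per reduce task
reduce_minutes = reducers_per_node * reduce_input_gb * MIN_PER_GB

total_minutes = map_minutes + comm_minutes + reduce_minutes
print(float(map_minutes), float(comm_minutes), float(reduce_minutes), float(total_minutes))
```

Data already on the reducer's own node (20 local map outputs per node) is excluded from `link_gb`, since the question states there is no network cost on the same server node. A different interpretation of how parallel transfers share a node's network path would change only `comm_minutes`.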

Piece-rate pay, commission pay, and profit sharing are all examples of

In concert with quality management programs, __________ controls help monitor the quality of goods or services at each step in the production process to alert managers to problems.

__________ is a technique used to control behavior.

__________ models propose that whether a leader who possesses certain traits or performs certain behaviors is effective depends on the situation or context.

Nucleotides contain all of the following except: 
