Research Paper - Improving Energy Efficiency of Bitcoin Mining Processor


Upload: idhanta-kakkar

Post on 15-Feb-2017


Page 1: Research paper - Improving Energy Efficiency Of Bitcoin Mining Processor

Abstract: Bitcoin is a very successful digital currency. An important question that Bitcoin miners need to consider is whether the investment in a new piece of hardware will pay off, versus simply buying the BTC on an exchange. It is embarrassing to buy a Bitcoin mining rig and never recoup the original BTC cost in the profits, especially since maintaining the rigs requires round-the-clock monitoring and considerable energy bills. A simple solution is to evaluate the return of the mining operation in terms of BTC. Here we look at how the Goldstrike 1 miner is designed to be energy efficient. We also look at various cooling techniques that help reduce heat radiation and, in turn, energy consumption. Large mining farms also need a strategic placement of heat detection systems, which is discussed as well.

Index Terms—Bitcoin, Energy Consumption, Cooling techniques

I. INTRODUCTION ¹

With paper money, a government decides when to print and distribute money. Bitcoin doesn't have a central government. With Bitcoin, miners use special software to solve math problems and are issued a certain number of bitcoins in exchange. This provides a smart way to issue the currency and also creates an incentive for more people to mine. On Dec. 16, 2014, when PandoDaily published its article, they recorded a global hashrate of 7,000,000 gigahashes per second (GH/s). That's a lot of hash. Blockchain.info, when they were still listing the statistics, estimated the electricity consumption this required by using a rate of 650 watts per GH/s. Multiplying the two figures means the Bitcoin network was supposedly drawing 4.55 gigawatts.

Multiply that by 24 to find that Bitcoin was purportedly using 109.2 gigawatt-hours per day.

But the actual Blockchain.info stat for energy consumption (which PandoDaily quoted) was 301 gigawatt-hours per day. Think about that for a month and then imagine paying the electricity bill for just one year of mining. Two factors dictate energy costs: the cost of energy and the energy consumption of the part. The advantage will go to those who have cheaper access to energy and who paid off their mining hardware when returns on hashing were higher. Cheaper energy allows them to pay off their newly acquired hardware over longer cycles and to continue operation even when $ per GH/s drops very low.
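As a check on the arithmetic, the following minimal sketch multiplies the two quoted inputs (7,000,000 GH/s and 650 W per GH/s) and converts the result to energy per day. The function and constant names are ours and purely illustrative; the inputs are taken at face value from the text, not measured.

```python
# Inputs exactly as quoted in the text; nothing here is measured data.
HASHRATE_GH_PER_S = 7_000_000   # global hashrate in GH/s
WATTS_PER_GH_PER_S = 650        # assumed power draw per GH/s, in watts


def network_power_gw(hashrate_gh_s: float, watts_per_gh_s: float) -> float:
    """Total electrical power of the network, in gigawatts."""
    return hashrate_gh_s * watts_per_gh_s / 1e9


def daily_energy_gwh(power_gw: float) -> float:
    """Energy consumed over 24 hours, in gigawatt-hours."""
    return power_gw * 24


power = network_power_gw(HASHRATE_GH_PER_S, WATTS_PER_GH_PER_S)
energy = daily_energy_gwh(power)
print(f"{power:.2f} GW, {energy:.1f} GWh/day")
```

With these inputs the estimate comes to 4.55 GW, or 109.2 GWh per day.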

¹ Paper was submitted for review on 6th November, 2015.
Author: Kanishk Agarwal (I001), MPSTME, NMIMS University, 3rd year.
Author: Atharv Johri (I014), MPSTME, NMIMS University, 3rd year.
Author: Idhanta Kakkar (I015), MPSTME, NMIMS University, 3rd year.

Others may have an advantage because they have more energy-efficient designs. In the case of Bitcoin, profitability is a direct function of the silicon, with few other factors except access to electricity and cooling.

II. TECHNIQUES

A. Energy Factor

An important question that Bitcoin miners need to consider is whether the investment of USD in a new piece of hardware will pay off, versus simply buying the BTC on an exchange. Many custom BTC mining rigs (or shares in companies that maintain them on your behalf) are denominated in BTC, so it's embarrassing to buy such a rig and never recoup the original BTC cost in its mining profits, especially since maintaining the rigs requires round-the-clock monitoring and considerable energy bills. A simple solution is to evaluate the return of the mining operation in terms of BTC. Two factors dictate energy costs: the cost of energy and the energy consumption of the part. The parties with the greatest advantage will be those that have cheaper access to large quantities of energy and that paid off their mining hardware when returns on hashing were higher. Cheaper energy allows these parties to pay off their newly acquired hardware over longer cycles, and to continue to operate even when $ per GH/s drops precipitously low. Others may have an advantage because they have more energy-efficient hardware designs.
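Evaluating the return in BTC, as suggested above, can be sketched as a simple balance of mined coins against hardware and energy costs. This is only an illustration under strong assumptions: the function name, every parameter, and all example figures are hypothetical, and the mining yield, energy price, and exchange rate are held constant over the period, which real mining does not guarantee.

```python
def net_return_btc(hardware_cost_btc: float,
                   mined_btc_per_day: float,
                   power_kw: float,
                   energy_price_usd_per_kwh: float,
                   btc_price_usd: float,
                   days: int) -> float:
    """Net return of a rig in BTC after hardware and energy costs.

    Assumes constant yield, energy price, and BTC price (a simplification).
    """
    energy_kwh = power_kw * 24 * days
    energy_cost_btc = energy_kwh * energy_price_usd_per_kwh / btc_price_usd
    return mined_btc_per_day * days - energy_cost_btc - hardware_cost_btc


# Hypothetical example figures, chosen only to exercise the formula.
example = net_return_btc(hardware_cost_btc=10.0,
                         mined_btc_per_day=0.05,
                         power_kw=2.0,
                         energy_price_usd_per_kwh=0.10,
                         btc_price_usd=400.0,
                         days=365)
print(f"net return: {example:.2f} BTC")
```

A negative result would mean the rig never recoups its BTC cost over the chosen period, in which case buying BTC on an exchange would have been the better choice.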

Improving Energy Efficiency Of Bitcoin Mining Processor
NMIMS Mukesh Patel School of Technology Management and Engineering

Kanishk Agarwal – [email protected]

Atharv Johri – [email protected]

Idhanta Kakkar – [email protected]


B. Goldstrike 1 Miner

The chip consists of 120 hash engines and, at a target clock frequency of 1.05 GHz, delivers 126 GH/s. Each hash engine solves a different problem, so at any given moment there are 120 different problems being worked on. There is a 128-entry-deep, 384-bit-wide input FIFO and a 128-entry-deep, 65-bit-wide output FIFO. An SPI controller writes data into the input FIFO and reads data out of the output FIFO. Every time new data is shifted into the input FIFO, output data is shifted out of the output FIFO. A PLL is used to synthesize the core clock from a 50 MHz crystal input. A thermal diode is used to monitor the die temperature so that supply voltage and clock frequency can be adjusted at the system level to keep the die temperature within safe limits. The input interface loads new work from the input FIFO into the working registers of a hash engine when the previously loaded work is completed. Each hash engine performs two rounds of SHA-256 processing, and the SHA-256 algorithm itself consists of 64 iterations. Both rounds and iterations are fully unrolled, resulting in a 128-stage-deep pipelined architecture. Hence, at any given instant, there are 128 different nonce values being processed in the pipeline. After an initial latency of 128 cycles, a hash output is produced every clock cycle and is checked by comparator logic to test whether the “target” criterion is met. If the target is met, the result is handed over to the output interface, which places it in the output FIFO.
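What a single hash engine computes can be sketched in software as a double SHA-256 over a block header with a varying nonce, followed by a target comparison. This is a sequential illustration only: the hardware processes 128 nonces at once in its pipeline and runs 120 engines in parallel. The header prefix and the very easy target below are hypothetical values chosen so that the search finishes quickly.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Two chained SHA-256 passes, as used for Bitcoin block hashing."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def search_nonce(header_prefix: bytes, target: int, start: int, count: int):
    """Try `count` consecutive nonces from `start`.

    Returns (nonce, hash_value) for the first hash that meets the target,
    or None if no nonce in the range qualifies.
    """
    for nonce in range(start, start + count):
        header = header_prefix + struct.pack("<I", nonce)  # 32-bit little-endian nonce
        value = int.from_bytes(double_sha256(header), "little")
        if value <= target:  # comparator: the hash must not exceed the target
            return nonce, value
    return None


# Hypothetical all-zero 76-byte header prefix and a deliberately easy target
# (roughly one hash in 256 qualifies), for illustration only.
prefix = bytes(76)
easy_target = 1 << 248
result = search_nonce(prefix, easy_target, 0, 100_000)
print(result is not None)
```

A real network target is far smaller, which is why the search is done in dedicated silicon rather than in software.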

Fabrication & Packaging

This design was fabricated on the Global Foundries 28 nm HKMG process with 9 metal layers. A die picture with an 11 x 11 grid layout implementation is shown in Figure 6.a. The center grid box contains the entire top-level logic. Each of the remaining 120 grey squares in the grid represents a single hash engine. There are routing channels between the hash engine blocks to route the input and output buses and other control signals to the cores. All the signal I/O ports are located on the right edge of the die. To reduce our overall system cost and meet power delivery and thermal dissipation requirements, we decided to put 4 bare dies in a 37.5 mm x 37.5 mm FCBGA package. Each die can deliver 126 GH/s at 1.05 GHz and 0.7 V, consuming approximately 125 W; hence the four-die package delivers the desired 504 GH/s at 500 W. Based on this power and performance, this design has an energy efficiency of 1 GH/J.

System Design

The Terraminer™ IV appliance consists of a master control processor (MCP) and two Goldstrike™ boards, with two Goldstrike™ packages per board. The MCP is connected to the Bitcoin network via its Ethernet connection and runs the Bitcoin protocol software. The MCP communicates with the Goldstrike™ boards via a USB connection. The Goldstrike™ board converts USB to the SPI and I2C protocols for communication with the Goldstrike™ ASICs and other devices such as remote temperature sensing, fan speed and other control functions.

Design Challenges

Some of the interesting design challenges we faced include high node toggle rates, high power density and a high sequential-to-combinational ratio. The very nature of the SHA-256 algorithm creates random bit patterns, resulting in toggle rates exceeding 80% on data path nodes in the hash engine. Compact physical layout for reduced die area, combined with high toggle rates, produced very high power density for this design. A fully unrolled SHA-256 implementation resulted in a design with a sequential cell count exceeding 50% of the total. Distributing the clock to all the sequential cells and closing timing was definitely a big challenge, and it took several iterations. Ensuring that the die junction temperature stayed within the 105 °C design specification in a system dissipating in excess of 2.4 kW was another challenge.

Efficiency of Goldstrike

Each die can deliver 126 GH/s at 1.05 GHz and 0.7 V, consuming approximately 125 W; hence the four-die package delivers the desired 504 GH/s at 500 W. Based on this power and performance, this design has an energy efficiency of 1 GH/J. This can be improved further to achieve even higher efficiency.
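The figures quoted above follow from a few multiplications, which the short sketch below reproduces. The constant and function names are ours; the numbers are those given in the text, and the one-hash-per-engine-per-cycle assumption is the steady-state behaviour of the fully unrolled pipeline described earlier.

```python
HASH_ENGINES_PER_DIE = 120
CLOCK_GHZ = 1.05
DIES_PER_PACKAGE = 4
WATTS_PER_DIE = 125


def die_hashrate_ghs(engines: int, clock_ghz: float) -> float:
    """Each engine emits one hash per clock cycle once its pipeline is full."""
    return engines * clock_ghz


die_ghs = die_hashrate_ghs(HASH_ENGINES_PER_DIE, CLOCK_GHZ)
package_ghs = die_ghs * DIES_PER_PACKAGE
package_watts = WATTS_PER_DIE * DIES_PER_PACKAGE
# GH/s divided by W (= J/s) gives gigahashes per joule.
efficiency_gh_per_j = package_ghs / package_watts

print(f"die: {die_ghs:.0f} GH/s, package: {package_ghs:.0f} GH/s at "
      f"{package_watts} W, efficiency: {efficiency_gh_per_j:.3f} GH/J")
```

The efficiency works out to roughly 1.008 GH/J, which the text rounds to 1 GH/J.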

Now that we know how the basic miner works, imagine a full room of these machines. They will generate a lot of heat and will need efficient heat detection systems and proper cooling techniques. Next, let us see how the heat detection system works and how we can use it to detect which source is emitting the most heat.

C. Strategic Sensor Placement

Studies have shown that a significant portion of the total energy consumption of many data centers is caused by the inefficient operation of their cooling systems. Without effective thermal monitoring with accurate location information, the cooling systems often use unnecessarily low temperature set points to overcool the entire room, resulting in excessive energy consumption.

CRAC (Computer Room Air Conditioning) accounts for up to half of the energy consumption. Different techniques are needed to cool the place; for example, HP engineers used an ingenious vent-tile technology.

The current solution is simplistic placement, in which sensor locations are chosen based on assumptions about where heat is generated, so that it can be detected immediately.

A better solution is to use mathematics to calculate where the sensors should be placed.

Here we use a technique based on CFD (Computational Fluid Dynamics). Using CFD, we try to solve two problems:

1. When the number of sensors is given, we seek the placement of the given sensors in the data center that maximizes the detection probability.

2. Considering the still high cost of each wireless temperature sensor, we seek to minimize the number of sensors needed.

Hot Server Detection Model

A data fusion technique is used to improve the detection performance of the sensors. Let R be the fusion radius. A simple data fusion scheme calculates the average temperature from all the sensors within the fusion radius of the monitored spot and compares that value with a threshold η. If the result is larger than η, the detection is positive. Measurement noise is modelled with a normal distribution:

$T_m(x, y, z) = T_r(x, y, z) + N_i$

where $T_m$ is the measured temperature, $T_r$ is the real temperature at that location without noise, and $N_i$ is the noise of sensor i.
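The fusion rule described above (average the readings inside the fusion radius, then compare with η) can be sketched as follows. The function names, the data layout of `(position, reading)` pairs, and the example readings are our own assumptions for illustration; the noise model is not simulated here.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def fused_temperature(spot: Point,
                      sensors: Sequence[Tuple[Point, float]],
                      radius: float) -> Optional[float]:
    """Average the readings of all sensors within `radius` of `spot`.

    Returns None when no sensor lies inside the fusion radius.
    """
    readings = [temp for pos, temp in sensors if math.dist(spot, pos) <= radius]
    if not readings:
        return None
    return sum(readings) / len(readings)


def is_hot(spot: Point, sensors: Sequence[Tuple[Point, float]],
           radius: float, eta: float) -> bool:
    """Positive detection when the fused temperature exceeds the threshold eta."""
    fused = fused_temperature(spot, sensors, radius)
    return fused is not None and fused > eta


# Hypothetical readings in degrees Celsius: two sensors near the spot, one far away.
sensors = [((0.0, 0.0, 0.0), 41.0), ((1.0, 0.0, 0.0), 43.0), ((5.0, 0.0, 0.0), 25.0)]
spot = (0.5, 0.0, 0.0)
print(fused_temperature(spot, sensors, radius=1.0), is_hot(spot, sensors, 1.0, eta=40.0))
```

Averaging several nearby sensors reduces the influence of any single noisy reading, which is the motivation for fusing rather than trusting one sensor.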

Detection Probability Maximization

Detection probability maximization lets us pinpoint where the most heat is coming from. The sensors are placed so as to maximize the probability of detecting the locations where the temperature exceeds the critical value, rather than reporting locations where the heat generated is below it.

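The notation is as follows:

M - number of locations in the data center room where the temperature needs to be monitored
N - number of sensors available for placement, with N < M
n - number of sensors within the fusion region
PF - false alarm rate of reporting an overheating emergency
PD - detection probability of an overheating emergency

Given a limited and reasonable number of sensors, N < M, we need to find the placement of these N sensors such that we can detect the overheating emergency at any of the M locations with the highest possible confidence. That is, we maximize the average detection probability over the M locations,

$\max \; \frac{1}{M} \sum_{i=1}^{M} P_{D_i}$

subject to the false alarm constraint

$P_{F_i} < \alpha, \quad 1 \le i \le M$

where α is the maximum acceptable false alarm rate.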

The detection threshold η can be calculated as:

$\eta = \frac{\sigma X_n^{-1}(1 - \alpha)}{n} + C$

where C is a constant.

Sensor Number Minimisation

Another problem in thermal monitoring is minimising the number of placed sensors. The cost of each sensor is high, and the installation cost of many sensors together is high too. To achieve the goal of sensor number minimisation, the detection probability objective is changed into a constraint. Assuming there are M locations in the data center room whose temperatures need to be monitored, we need to minimise the sensor number N that can yield a detection probability higher than β and a false alarm rate lower than α.

$\arg\min N$

subject to the constraints:

$P_{F_i}(S_N) < \alpha, \quad 1 \le i \le M$

$P_{D_i}(S_N) > \beta, \quad 1 \le i \le M$

where $S_N$ is the list of locations of all the N sensors.
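The minimisation above can be illustrated with a brute-force search that tries N = 1, 2, ... and returns the first placement satisfying both constraints. This is not the CFD-based method of the cited work, only a sketch of the optimisation problem's structure. The function name, the callable signatures, and the toy probability model (a sensor "covers" locations within distance 1 on a line) are all hypothetical, and exhaustive search is practical only for very small rooms.

```python
from itertools import combinations


def minimise_sensors(candidates, locations, p_detect, p_false, alpha, beta):
    """Return the smallest placement meeting both constraints, or None.

    p_detect(placement, loc) and p_false(placement, loc) are caller-supplied
    models of detection probability and false alarm rate at a location.
    """
    for n in range(1, len(candidates) + 1):      # try N = 1, 2, ... in order
        for placement in combinations(candidates, n):
            if all(p_false(placement, loc) < alpha and
                   p_detect(placement, loc) > beta
                   for loc in locations):
                return placement
    return None


# Toy stand-in models (hypothetical): positions lie on a line, and a location
# is detected reliably only if some sensor is within distance 1 of it.
def toy_p_detect(placement, loc):
    return 0.9 if any(abs(s - loc) <= 1 for s in placement) else 0.1


def toy_p_false(placement, loc):
    return 0.01


spots = list(range(6))
best = minimise_sensors(spots, spots, toy_p_detect, toy_p_false,
                        alpha=0.05, beta=0.8)
print(best)
```

In this toy setting two sensors, at positions 1 and 4, are enough to cover all six locations, so the search stops at N = 2.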

We have looked in depth at how sensor placement can be improved to detect the heat generated. Now let us look at the cooling techniques that can be used to mitigate that heat, so that energy consumption is lower and damage does not occur.

III. COOLING TECHNIQUES

Several cooling techniques need to be compared in order to find the one most suited to our requirement. Here are some:

1. Immersion Cooling

It is used to cool high heat flux components. Unlike the water-cooled cold plate approaches, which utilize physical walls to separate the coolant from the chips, immersion cooling brings the coolant in direct physical contact with the chips. As a result, most of the contributors to internal thermal resistance are eliminated, except for the thermal conduction resistance from the device junctions to the surface of the chip in contact with the liquid.

Direct liquid immersion cooling offers a high heat transfer coefficient, which reduces the temperature rise of the heated chip surface above the liquid coolant temperature. The magnitude of the heat transfer coefficient depends upon the thermophysical properties of the coolant and the mode of convective heat transfer employed.
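The link between the heat transfer coefficient and the surface temperature rise follows Newton's law of cooling: the rise equals the heat flux divided by the coefficient. The sketch below shows this relation; the function name is ours, and the heat flux and coefficient values are arbitrary illustrative numbers, not data for any particular coolant.

```python
def surface_temperature_rise(heat_flux_w_per_m2: float,
                             h_w_per_m2_k: float) -> float:
    """Rise of the chip surface above the coolant temperature, in kelvin.

    Newton's law of cooling: q'' = h * dT, hence dT = q'' / h.
    """
    return heat_flux_w_per_m2 / h_w_per_m2_k


# Arbitrary illustrative values: same heat flux, two different coefficients.
flux = 50_000.0                      # W/m^2
rise_low_h = surface_temperature_rise(flux, 1_000.0)
rise_high_h = surface_temperature_rise(flux, 10_000.0)
print(f"{rise_low_h:.1f} K vs {rise_high_h:.1f} K")
```

A tenfold higher coefficient gives a tenfold smaller temperature rise for the same heat flux.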

2. Air-Cooled Systems


Forced air-cooled systems may be further subdivided into serial and parallel flow systems. In a serial flow system the same air stream passes over successive rows of modules or boards, so that each row is cooled by air that has been preheated by the previous row. Depending on the power dissipated and the air flow rate, serial air flow can result in a substantial air temperature rise across the machine. The rise in cooling air temperature is directly reflected in increased circuit operating temperatures. This effect may be reduced by increasing the air flow rate. Parallel air flow systems have been used to reduce the temperature rise in the cooling air.
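The preheating effect in a serial flow system can be estimated with a steady-state energy balance, in which each row raises the air temperature by its power divided by the product of mass flow rate and specific heat. The sketch below assumes all dissipated power goes into the air stream and uses an approximate specific heat for air; the function name and example figures are hypothetical.

```python
AIR_CP_J_PER_KG_K = 1005.0  # approximate specific heat of air


def serial_inlet_temperatures(inlet_c, row_powers_w, mass_flow_kg_s,
                              cp=AIR_CP_J_PER_KG_K):
    """Air temperature entering each row when one stream cools rows in series.

    Returns (list of per-row inlet temperatures, final exhaust temperature).
    """
    inlets = []
    temp = inlet_c
    for power in row_powers_w:
        inlets.append(temp)
        temp += power / (mass_flow_kg_s * cp)  # dT = P / (m_dot * cp)
    return inlets, temp


# Hypothetical example: four rows of 500 W each, 25 degC supply air.
rows = [500.0] * 4
inlets, exhaust = serial_inlet_temperatures(25.0, rows, mass_flow_kg_s=0.1)
_, exhaust_fast = serial_inlet_temperatures(25.0, rows, mass_flow_kg_s=0.2)
print([round(t, 2) for t in inlets], round(exhaust, 2), round(exhaust_fast, 2))
```

Doubling the air flow rate halves the total temperature rise, which matches the remark that the effect may be reduced by increasing the air flow rate.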

3. Hybrid Air–Water Cooling

An air-to-liquid hybrid cooling system offers a method to manage cooling air temperature in a system without resorting to a parallel configuration and higher air flow rates. In a system of this type, a water-cooled heat exchanger is placed in the heated air stream to extract heat and reduce the air temperature. Of the three techniques, hybrid air-water cooling is therefore the one best suited to our requirement.

IV. DYNAMIC CACHE RESOURCE POOLING

One way to improve the energy efficiency of the Bitcoin mining processor is to change the structure of the processor: it should use dynamic cache pooling in a 3D stacked multicore processor. In resource pooling, multiple components are shared among different cores, which also reduces the total chip area. In 3D stacked processors the caches are stacked vertically and connected with through-silicon vias (TSVs), which helps in pooling the resources efficiently.

There are three basic things to keep in mind: the 3D stacked architecture with cache pooling, application-aware job allocation, and the evaluation of dynamic cache resource pooling. In the proposed structure, the L2 caches are stacked one above another and connected using TSVs. The objective of this design is to increase performance by increasing the cache size whenever needed and to save power by turning off unused cache partitions. Implementing cache resource pooling requires modifying the cache status registers and the cache control logic. Local Cache Status Registers (LCSR) record the status of the cache partitions, and Remote Cache Status Registers keep the L1 cache aware of the processes.

According to the runtime application-aware job allocation and cache pooling policy, cache-hungry jobs should be stacked with less cache-hungry jobs so that they share the pool of cache efficiently. This policy contains two stages: 1. job allocation and 2. cache resource pooling.

Job Allocation

First, the jobs are allocated to caches randomly. Within a few seconds, a predefined regression-based predictor sorts the jobs on the basis of their performance improvements. Among four jobs (j1, j2, j3, j4), where j1 is the most cache hungry and j4 the least, j1 and j4 are grouped together and j2 and j3 are grouped together. Also, j2 and j3 are placed near the heat sink so that they can give away heat quickly, making the arrangement more efficient.

Cache Resource Pooling

This is a method to manage a job pair. A threshold (t) is used to determine whether a job needs more cache partitions. This threshold represents the minimum improvement that results in a lower EDP (energy-delay product). The results state that EDP and EDAP (energy-delay-area product) improved by 39.2% and 57.2% in comparison with 3D systems with static cache sizes.
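The pairing rule and the threshold test described above can be sketched as follows. The predictor itself is not modelled: the predicted improvement per job is simply supplied as a number, and the function names, the dictionary layout, and the example values are our own assumptions.

```python
def pair_jobs(predicted_gain):
    """Pair the most cache-hungry job with the least, the second most with
    the second least, and so on. With an odd number of jobs, the middle
    job is left unpaired and is not returned.
    """
    ranked = sorted(predicted_gain, key=predicted_gain.get, reverse=True)
    pairs = []
    while len(ranked) >= 2:
        pairs.append((ranked.pop(0), ranked.pop()))
    return pairs


def needs_more_cache(predicted_improvement, threshold):
    """A job is granted another cache partition only if its predicted
    improvement exceeds the threshold t."""
    return predicted_improvement > threshold


# Hypothetical predicted improvements from additional cache, per job.
gains = {"j1": 0.30, "j2": 0.15, "j3": 0.10, "j4": 0.02}
pairs = pair_jobs(gains)
print(pairs)
print([job for job in gains if needs_more_cache(gains[job], threshold=0.12)])
```

With these example values the pairs are (j1, j4) and (j2, j3), and only j1 and j2 exceed the threshold.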

V. CONCLUSION

As the difficulty level for Bitcoin mining continues to increase, there is a continued effort to build new Bitcoin mining processors with reduced power consumption and increased performance. Apart from migrating the design to the latest process technologies, there are still some opportunities to improve performance and reduce power through customizing the microarchitecture and circuit design. The Goldstrike™ 1 processor has an energy efficiency of 1 GH/J, and this can be increased to greater than 4 GH/J.

Ultimately, for innovation in the hardware space, we need lots of new ideas to be tried out for cheap. However, the semiconductor model has increasingly moved away from this direction to expensive chips. As a result, chip startups are largely non-existent and there are few markets in which high-risk, innovative ideas can be examined. At the same time, demand for hardware engineers is dropping and fresh hardware talent is being diverted away from hardware companies to software companies that offer higher salaries. This creates an increasingly unhealthy death spiral where fewer new ideas are being tried and the top talent is leaving the field. We need to think about strategic ways to enable cheaper chips for new ideas, through open-source CAD tools, for instance, or new technologies to reduce chip costs, or more fluid financing methods that spread risk better, and through better education and training, in order to enter the Age of Bespoke Silicon.

Future Scope

The techniques mentioned in this paper can be applied to data mining farms to improve the energy efficiency of the processors. The processor's architecture and cache can be further modified to solve the heating problem and reduce the energy footprint. Research can be done on cooling liquids that can be used instead of dielectrics in immersion cooling to provide a better heat sink.

REFERENCES

1. Javed Barkatullah and Timo Hanke, "Goldstrike™ 1: CoinTerra's First Generation Crypto-Currency Mining Processor for Bitcoin."

2. Karl J. O'Dwyer and David Malone, "Bitcoin Mining and its Energy Footprint," Hamilton Institute, National University of Ireland Maynooth.

3. Michael Bedford Taylor, "Bitcoin and the Age of Bespoke Silicon," University of California, San Diego.

4. Jie Meng, Tiansheng Zhang, and Ayse K. Coskun, "Dynamic Cache Pooling for Improving Energy Efficiency in 3D Stacked Multicore Processors," Electrical and Computer Engineering Department, Boston University, Boston, MA, USA.

5. Xiaodong Wang, Xiaorui Wang, Guoliang Xing, Jinzhu Chen, Cheng-Xian Lin, and Yixin Chen, "Intelligent Sensor Placement for Hot Server Detection in Data Centers."
