
Final Year Project Proposal - 38488566

Distributed Compute Placement

Abstract

The project aims to research and develop effective algorithms and methods for distributed compute placement in cloud computing and distributed systems. The primary objective is to enhance the seamless interaction between clients and servers, taking into account factors such as latency, service quality, workload, cost-efficiency, and energy consumption.

This project will be divided into six major strands:

-     Research on existing solutions

-     Creation of a basic server-client architecture

-     Identification of use cases based on different scenarios

-     Implementation of machine learning algorithms to allocate resources

-     Cost optimisation

-     Results.

Ultimately, this project aims to develop new AI-powered auto-scaling algorithms while minimising cost and maximising energy efficiency.

1. Introduction

In recent years, the proliferation of cloud computing and the growth of distributed systems have highlighted the importance of efficient resource allocation. Traditional methods often fall short of meeting the dynamic demands of modern applications. Consequently, research into distributed compute placement has gained significance. Existing studies have focused on load balancing, task scheduling, and resource allocation algorithms, but there is still room for improvement and innovation.

This project builds upon existing research while exploring new avenues to enhance distributed compute placement. In other words, it delves into the intricacies of decision-making algorithms designed to strategically place offloadable compute tasks within a distributed system's architecture.

Despite the complexity of self-scaling server systems, existing work has rarely considered cost and energy consumption while maintaining quality of service.

This project aims to find an alternative algorithm that uses machine learning and artificial intelligence to deploy a self-distributing server system, with cost and energy efficiency as the major research targets, in order to optimise the client experience as well as the server and database workload.

2. Background

1. Escalating Data Centre Costs in the UK:

Data centres in the United Kingdom have experienced a steep rise in operational costs.

According to a report by the UK Energy Research Centre, the energy consumption of data centres within the country has seen an annual increase of 2.2% in recent years, contributing significantly to the national electricity consumption [1]. These trends emphasise the growing concern regarding the rising costs associated with powering and maintaining data centres in the UK.

2. Server Inefficiencies and Energy Waste:

Existing self-distributing server placement systems often prioritise performance over efficiency, resulting in server overprovisioning and resource underutilisation. A study by the National Grid, the UK's electricity system operator, has shown that many servers within UK data centres operate at only 12% of their capacity, highlighting the considerable inefficiencies in resource management and energy consumption [2]. Such inefficiencies represent a significant financial burden and environmental challenge.

3. Environmental Impact in the UK:

The environmental implications of inefficient data centres in the UK are increasingly concerning. A report by the Carbon Trust, a leading organisation in sustainability and climate change, revealed that data centres in the UK accounted for 2% of the country's total electricity consumption and were responsible for 2.5 million metric tons of carbon emissions in 2020 [3]. This environmental impact underscores the urgency of developing more sustainable computing solutions.

In light of these challenges, this proposal aims to design and implement a self-distributing compute placement system that addresses the shortcomings of existing solutions, with a primary focus on optimising cost and energy consumption. By leveraging advanced algorithms and data-driven insights, this project seeks to create a more responsible and efficient computing infrastructure that not only meets the demands of modern computing but also aligns with our environmental and financial goals.

The project also invites exploration of alternative applications, broadening the horizon of research possibilities. Two methods will be implemented, machine learning-based resource allocation and AI-powered auto-scaling, both with cost and environmental optimisation in mind.

3. The Proposed Project

3.1 Aims and Objectives

Develop an algorithm that dynamically allocates virtual machines to physical servers based on current workloads and resource availability, taking cost and energy consumption into account. The project involves:

o Understanding and learning the concepts of distributed computing (the first stage in 3.2)

o Exploring the limitations of existing self-distributing data centres

o Understanding and utilising machine learning

o Deploying an algorithm that improves on existing self-scaling compute servers with regard to cost and energy consumption

o Demonstrating that the project improves upon existing solutions.

3.2 Methodology

In order to comprehend the relationship between latency and data transfer between clients and servers, the first stage is to test and evaluate a basic model:

o One server, one client, same location

o One server, five clients, same and different locations

o Two servers, five clients, different locations, and so on
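The round-trip measurements for these scenarios could be collected with a small harness like the following. This is a minimal sketch, not the project's final test rig: it assumes a plain TCP echo exchange as a stand-in for real client-server traffic, with localhost standing in for the different locations.

```python
import socket
import threading
import time

def echo_server(host="127.0.0.1", port=0):
    """Start a minimal TCP echo server in a background thread; return its port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen()
    port = srv.getsockname()[1]

    def serve():
        while True:
            conn, _ = srv.accept()
            with conn:
                data = conn.recv(1024)
                conn.sendall(data)

    threading.Thread(target=serve, daemon=True).start()
    return port

def measure_rtt(host, port, payload=b"ping"):
    """Return the round-trip time in seconds for one request/response."""
    start = time.perf_counter()
    with socket.create_connection((host, port)) as c:
        c.sendall(payload)
        c.recv(1024)
    return time.perf_counter() - start

# one server, one client, same location: log five latency samples
port = echo_server()
samples = [measure_rtt("127.0.0.1", port) for _ in range(5)]
```

Logging many such samples per scenario would give the machine learning stage its training data.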

Create logging and implement a machine learning algorithm:

The initial idea is that when overload is detected on a server, the system records client IP addresses, determines each client's locality, and automatically creates a new server at a location chosen from the logged list. This aims to give clients a better experience with respect to latency and the risk of packet loss caused by overload of the existing server.
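The overload-triggered placement decision could be sketched as follows. The threshold value and the region names are hypothetical placeholders; a real system would derive the overload signal from monitoring data and the localities from IP geolocation of the logged clients.

```python
from collections import Counter

OVERLOAD_THRESHOLD = 0.8  # hypothetical CPU utilisation cut-off

def choose_new_region(cpu_utilisation, client_regions, existing_regions):
    """If the server is overloaded, propose the most common client region
    that does not already host a server; otherwise return None."""
    if cpu_utilisation < OVERLOAD_THRESHOLD:
        return None
    # count logged client localities, most frequent first
    for region, _count in Counter(client_regions).most_common():
        if region not in existing_regions:
            return region
    return None  # every popular region is already served
```

For example, an overloaded server whose logged clients are mostly in one unserved region would propose spawning a server there.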

Taking cost and energy efficiency into consideration, the algorithm will fulfil the following criteria:

Server Consolidation: Consolidate workload onto as few servers as possible. Fewer servers typically mean lower hardware, maintenance, and energy costs. This strategy may involve optimising resource allocation and using server virtualisation techniques, while removing or relocating the least-used servers.
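Consolidation can be framed as a bin-packing problem. The sketch below uses the classic first-fit-decreasing heuristic under the simplifying assumption that each workload is a single CPU share and all servers have the same capacity; it is illustrative, not the algorithm this project will necessarily adopt.

```python
def consolidate(workloads, server_capacity):
    """First-fit-decreasing packing: assign workloads (CPU shares) to as few
    servers of the given capacity as possible. Returns one list of workloads
    per server used."""
    servers = []
    for w in sorted(workloads, reverse=True):  # largest workloads first
        for s in servers:
            if sum(s) + w <= server_capacity:
                s.append(w)  # fits on an existing server
                break
        else:
            servers.append([w])  # open a new server
    return servers

packing = consolidate([0.5, 0.4, 0.3, 0.6, 0.2], server_capacity=1.0)
```

Here five workloads totalling 2.0 capacity units pack onto two servers rather than five, directly reducing hardware and energy cost.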

Resource Rightsizing: Ensure that the resources (CPU, memory, storage) allocated to each server are appropriately sized for the workload. Avoid over-provisioning, as it can lead to wasted resources and higher costs. Consider dynamic allocation of resources.
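A simple rightsizing rule is to size each resource to its observed peak plus a safety margin. The 20% headroom below is a hypothetical default, not a recommendation from this project; in practice the margin would itself be tuned from monitoring data.

```python
def rightsize(observed_peaks, headroom=0.2):
    """Suggest per-resource allocations: observed peak usage plus a
    safety margin, instead of a fixed over-provisioned size."""
    return {res: round(peak * (1 + headroom), 2)
            for res, peak in observed_peaks.items()}

# e.g. a VM that peaked at 2 cores and 4 GB of memory
suggestion = rightsize({"cpu_cores": 2.0, "mem_gb": 4.0})
```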

Optimise Network Costs: Minimise data transfer costs between servers. Keep data locality in mind when distributing tasks, and consider using content delivery networks (CDNs) or edge computing (on the client side) to reduce the need for long-distance data transfers and to reduce the volume of data moved.
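Data locality can be enforced by routing each client to the lowest-latency serving region. The region names and latency figures below are hypothetical; a deployed system would use measured inter-region latencies or a geo-IP database.

```python
# hypothetical (client_region, server_region) -> latency in milliseconds
LATENCY_MS = {
    ("eu-west", "eu-west"): 5,
    ("eu-west", "us-east"): 80,
    ("us-east", "eu-west"): 80,
    ("us-east", "us-east"): 6,
}

def nearest_server(client_region, server_regions, latency_ms):
    """Pick the serving region with the lowest latency to the client,
    keeping traffic local where possible."""
    return min(server_regions, key=lambda s: latency_ms[(client_region, s)])
```

Routing this way keeps most traffic within a region, which reduces both latency and long-distance transfer cost.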

As the research develops, the methodology may be refined, which could create differences from these initial points.

4. Programme of Work

-     Research on existing solutions wk 3-4

-     Find a new solution that improves the quality and latency of data transmission between server and client, building on existing solutions while considering cost and energy efficiency wk 4-5

-     Identify use cases wk 5-6

-     Design a system architecture and test without the algorithm, to locate problems in a basic server-client model wk 6-8

-     Implement prototype (testing) wk 13-15

-     Test and evaluate wk 15-16

-     Optimise and refine wk 16-17

-     Documentation and reporting wk 17-19

5. Resources Required

Virtual machines for allocating physical servers and testing, as well as for creating clients in different locations. A database is also required for the physical servers. I might consider using a VPN to virtualise the servers' locations; given the extra latency introduced by private VPN providers, I will either create my own VPN server or use an existing VPN provider. The project also requires a stable and fast internet connection in order to reduce faults on the servers.

6. References

1.   UK Data Centres – Carbon Neutral by 2030? (UKERC, 2020)

https://ukerc.ac.uk/news/uk-data-centres-carbon-neutral-by-2030/

Accessed 10 Sep 2020

2.   Data Centres (NationalGridESO, 2022)

https://www.nationalgrideso.com/document/246446/download

Accessed March 2022

3.   Europe's top data centre hubs produce 1.2 million homes-worth of emissions (Digitalisation World, 2023)

https://digitalisationworld.com/news/66334/europes-top-data-centre-hubs-produce-12-million-homes-worth-of-emissions

Accessed 25 Oct 2023