Abstract: The dual water cycle of nature and human society generates massive volumes of water observation data. Existing water data management systems suffer from heavy storage loads, databases that are difficult to scale, and slow queries, so they cannot meet current storage and analysis needs. To address these problems, the basic architecture of a distributed big data storage platform is first designed by combining virtualization technology with the Hadoop infrastructure. Second, the platform design is implemented on the basis of existing water utility big data and the actual business database tables. Finally, code for migrating data from the centralized platform to the distributed platform is written, and a data migration experiment is carried out. The experimental results verify the feasibility and effectiveness of the proposed design, which can serve as a distributed solution for storing and processing large-scale industrial data.