The default MapReduce task scheduling mechanism and HDFS storage strategy in Hadoop perform poorly in complex, heterogeneous application scenarios and have become a bottleneck for massive data processing. This paper addresses a concrete patent image retrieval system built on the Hadoop platform, in which the default MapReduce task scheduling mechanism and HDFS storage strategy perform poorly in practice, and proposes optimization measures for each to improve system performance. The main work is as follows: (1) The design principles and execution flow of the Hadoop framework, HDFS, and MapReduce are analyzed in depth. (2) To address the problems that arise when the Hadoop platform is used in practice for the patent image retrieval system, improvements are made with the LASE task scheduling strategy and the HIFI storage strategy, respectively. (3) The optimization strategies for image retrieval on the Hadoop platform are verified by experiments. The experimental results show that the optimized system improves the performance of the Hadoop platform and reduces the response time of user requests.