Problem statement: Understanding the problem statement is one of the keys to building a good machine learning model. Domain knowledge plays a crucial role in understanding it. Defining a problem statement is not easy. The examples and problems available on the web come with clearly defined problem statements, but when you start working on real-world use cases, it is very hard to understand the problem. Data scientists and data analysts play a major role here.
Data: Nowadays we have a lot of data in the form of text, images, audio, and video. The problem is that most of this data is unstructured and not clean. The sample data available on the web (Kaggle, for example) is mostly already cleaned, which helps for practice and learning. But when you start working on real-world projects, you won't get cleaned data. Data engineers play a major role in cleaning unstructured data, a step commonly known as the pre-processing stage.
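To make the cleaning step a little more concrete, here is a minimal sketch of what cleaning raw text records might look like. The function names (clean_text, clean_records) and the sample records are made up for illustration, and real projects usually need far more than this (encoding fixes, language-specific rules, handling of images or audio, and so on).

```python
import re

def clean_text(raw):
    """Strip HTML-like tags, collapse whitespace, and lowercase the text."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)             # replace tags with a space
    collapsed = re.sub(r"\s+", " ", no_tags).strip()   # collapse runs of whitespace
    return collapsed.lower()

def clean_records(records):
    """Clean each record, dropping missing/empty values and exact duplicates.

    The order of first occurrences is preserved.
    """
    seen = set()
    cleaned = []
    for raw in records:
        if raw is None:          # missing value
            continue
        text = clean_text(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned

# Hypothetical raw records, as they might arrive from a scraper or a form.
raw_records = ["<p>Great  product!</p>", "great product!", None, "   ", "Arrived\nlate"]
print(clean_records(raw_records))
```

Here the duplicate, the missing value, and the blank record are dropped, and only two cleaned records remain.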
Understanding Data: Even when cleaned data is available, you need to understand the data samples clearly to train a better model: how the data is distributed and which features to take from it. If the data is not evenly distributed (for example, one class has far more samples than the others), the ML model may lean toward the over-represented class and perform poorly on the rest. Always check how the input data is distributed, and balance it where needed, before training. Data analysts play a major role in this.
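One simple way to check how the data is distributed is to count the samples per class before training. The sketch below uses only the Python standard library; the function names (class_distribution, imbalance_ratio) and the spam/ham labels are illustrative assumptions, not part of any particular library or dataset.

```python
from collections import Counter

def class_distribution(labels):
    """Return the fraction of samples that belongs to each class."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def imbalance_ratio(labels):
    """Return (largest class size) / (smallest class size); 1.0 means balanced."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical labels: 10 "spam" samples and 90 "ham" samples.
labels = ["spam"] * 10 + ["ham"] * 90
print(class_distribution(labels))
print(imbalance_ratio(labels))
```

A ratio far above 1 suggests the classes are unevenly represented. What counts as "too imbalanced" depends on the problem, so treat any threshold as a judgment call rather than a fixed rule.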
Happy Learning!!