Sunday Short Courses
Additional fees apply. Visit https://www.slas2020.org/program/short-course-program/ for registration fees.
This one-day, hands-on course introduces the open source software KNIME for data analysis and data visualization. KNIME does not require any programming skills: the user works with data by building pipelines of nodes in a graphical interface, with each node executing a separate step. This makes working with data more natural and facilitates the analytical process. The graphical interface also lets users easily explore their data and test different analytical methods, and it lets scientists with no programming skills apply advanced statistical methods to their data.
Data scientists and bioinformaticians who are used to exploring their data with scripts will find that KNIME integrates several programming languages. A coder can easily add a graphical user interface to a script so that it is displayed in the KNIME interface. Data scientists can therefore continue using their favorite code in their favorite language while letting users parametrize the algorithms on their own, and can concentrate on solving new problems instead of solely executing code.
The course comprises two parts. In the morning, students will be taught the basics of KNIME: how to load data, select rows and columns, aggregate data and visualize data. In the afternoon, an introduction to multivariate analysis will be given. Attendees will be taught how to calculate various correlations, normalize data, cluster data and use machine learning to analyze biological data. Attendees are strongly encouraged to bring their own data so they can start their own analysis pipeline.
Who Should Attend
• Biologists creating quantitative data who want to analyze and visualize their data.
• Data scientists and bioinformaticians who want to share their analytical pipelines with collaborators who are not comfortable working with scripts and command line interfaces.
No programming skills are required, but an understanding of the concept of a 384-well experiment is helpful.
• Attendees will be able to analyze, visualize and summarize their data in ways that would normally require knowledge of a programming language.
• Data scientists and bioinformaticians will gain a new tool for sharing their analytical pipelines without having to be responsible for executing code each time new data is generated.
• KNIME basics:
• Installing KNIME with extensions and programming languages R and Python
• Data wrangling: importing data, selecting rows and columns, annotating, joining data fields and concatenating data fields
• Data analysis: data summary (mean, standard deviations, normality, etc.), parameter correlation, parameter selection, normalization
• Data visualization: bar plots, box plots, plate heatmaps, phenotypic fingerprint plots
• Data mining:
• Unsupervised clustering: distance measures (Euclidean, cosine, etc.), various clustering algorithms (hierarchical, k-means, etc.)
• Supervised learning: various machine learning algorithms (decision trees, random forest, regression, deep learning, etc.), algorithm training (training set selection, cross validation, etc.), model interpretation
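For attendees who already work with scripts, two of the topics listed above, normalization and distance measures, can be sketched in a few lines of plain Python. This is only an illustrative sketch and not course material: the function names and the example well readouts are hypothetical, and in the course these steps are carried out with KNIME nodes rather than hand-written code.

```python
import math
from statistics import mean, stdev


def z_score(values):
    """Rescale values to zero mean and unit sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("cannot z-score values with zero standard deviation")
    return [(v - m) / s for v in values]


def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two feature vectors."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical readouts for two wells, each measured on three parameters.
    well_a = [1.0, 2.0, 3.0]
    well_b = [2.0, 4.0, 6.0]
    print(z_score(well_a))
    print(euclidean_distance(well_a, well_b))
    print(cosine_distance(well_a, well_b))
```

The two wells in the example have proportional readouts, so their cosine distance is close to zero while their Euclidean distance is not. This is one reason the choice of distance measure matters before clustering.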