[CSEE Talk] talk: From Terabyte-sized Stem Cell Images to Knowledge, 10am Mon 3/10

Tim Finin finin at cs.umbc.edu
Tue Feb 18 08:59:36 EST 2014


           From Terabyte-Sized Stem Cell Images to Knowledge

                           Peter Bajcsy, PhD
                   Information Technology Laboratory
             National Institute of Standards and Technology

                     10:00am 10 March 2014, ITE 346

This talk will present the computational challenges and approaches to
knowledge discovery from terabyte-sized images. The motivation comes
from experimental systems for imaging and analyzing human pluripotent
stem cell cultures at the spatial and temporal coverage of colonies
that lead to terabyte-sized image data. The objective of such an
unprecedented cell study is to characterize pluripotency of stem cell
colonies over time at high statistical significance in order to
understand the stem cell culture quality parameters and guide a
repeatable growth of high-quality stem cell colonies. The
terabyte-sized images represented a stem cell line that was engineered
to produce green fluorescent protein (GFP) under the influence of the
Oct4 promoter and then imaged in a mosaic of contiguous frames
covering approximately 180 square millimeters over five days under
both phase contrast and GFP channels.

We survey multiple computer and computational science problems
related to correcting (flat-field, dark current and background),
stitching, segmenting, tracking, re-projecting and then representing
large images for interactive visualization and sampling in a web
browser. We researched extensions to Amdahl's law for Map-Reduce
computations, established benchmarks for image processing on a Hadoop
platform, and introduced cluster node utilization coefficients for
modeling memory-demanding computations running on a computer
cluster/cloud. The theoretical aspects of algorithmic complexity and
cluster utilization at terabyte scale are extended to the experimental
aspects of efficient image representation and client-server workload
distribution in the context of visualization interactivity and image
sampling. We report such experimental results for the NIST extensions
to the Deep Zoom paradigm. The presentation will conclude with
illustrations of enabled stem cell discoveries and collaboration
opportunities to create a reference resource not only for cell
biologists but also for computer scientists focusing on terabyte scale
image analyses.

Peter Bajcsy received his Ph.D. in Electrical and Computer Engineering
in 1997 from the University of Illinois at Urbana-Champaign and an
M.S. in Electrical and Computer Engineering in 1994 from the
University of Pennsylvania.  He worked for machine vision, government
contracting, and research and educational institutions before joining
the National Institute of Standards and Technology (NIST) in 2011. At
NIST, he has been leading a project focusing on the application of
computational science in biological metrology, and specifically stem
cell characterization at very large scales. Peter's area of research
is large-scale image-based analyses and syntheses using mathematical,
statistical and computational models while leveraging computer science
fields such as image processing, machine learning, computer vision,
and pattern recognition. He has co-authored more than 24 journal
papers, eight books or book chapters, and close to 100 conference
papers.

Host: Yelena Yesha (yeyesha at umbc.edu)

     -- more information and directions: http://bit.ly/UMBCtalks --

