Rapid advances in field robotics, unmanned aerial systems (UAS), sensor and satellite technology, and computing power have facilitated exponential growth in remote sensing data and its applications. In this work, we present a concept of UAV and satellite spatio-temporal data fusion for crop monitoring, specifically plant phenotyping and yield prediction. Low-cost sensors integrated on a UAV were used to collect RGB, multispectral, and thermal images during the growing season over multiple test sites in the US. PlanetScope and WorldView-3 multispectral data were combined with UAV data following rigorous image pre-processing, including pan-sharpening, atmospheric correction, and reflectance retrieval. UAV thermal and multispectral data were calibrated to canopy temperature and reflectance. State-of-the-art machine learning methods, specifically deep learning, were used to estimate plant traits including chlorophyll concentration, biomass, and yield. Our results show that (1) spatio-temporal data fusion from airborne and satellite systems provides an effective means for capturing early stress; (2) UAV data can compensate for the limitations of satellite remote sensing data for field-level crop monitoring, addressing not only mixed-pixel issues but also filling the temporal gaps in satellite data availability; and (3) spatio-temporal gap filling enables more accurate yield prediction using data collected at optimal growth stages (e.g., the seed filling stage). The concept developed in this paper also provides a framework for accurate and robust estimation of plant traits and grain yield and delivers valuable insight for high spatial precision in high-throughput phenotyping and farm field management.
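To illustrate the temporal gap filling idea underlying point (3), the sketch below linearly interpolates a fused UAV and satellite vegetation-index time series to daily resolution so that a trait can be queried at a growth stage with no direct observation. The day-of-year values and NDVI numbers are illustrative assumptions, not data from this study, and the interpolation method stands in for whatever fusion model is actually used.

```python
import numpy as np

# Hypothetical fused time series: days of year on which a UAV flight or a
# satellite overpass produced a usable NDVI observation (illustrative values).
doy_obs = np.array([140, 152, 165, 190, 210])
ndvi_obs = np.array([0.35, 0.52, 0.68, 0.74, 0.61])

# Fill the temporal gaps with daily linear interpolation.
doy_daily = np.arange(doy_obs[0], doy_obs[-1] + 1)
ndvi_daily = np.interp(doy_daily, doy_obs, ndvi_obs)

# Query the gap-filled series at a stage with no direct observation
# (e.g., a hypothetical seed filling date at day 200).
seed_fill_doy = 200
ndvi_at_stage = ndvi_daily[seed_fill_doy - doy_obs[0]]
```

In practice the interpolated series would feed the downstream trait and yield models; linear interpolation is only the simplest possible gap filler, chosen here for clarity.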