SYNTHETIC APERTURE RADAR IN AGRICULTURE WITH AI-ENHANCED TECHNIQUES FOR CROP CLASSIFICATION, CROP MONITORING, AND YIELD PREDICTION

By
Mahya Sadat Ghazi Zadeh Hashemi

A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
Civil Engineering – Doctor of Philosophy
2024

ABSTRACT

This PhD research advances the application of high-resolution Synthetic Aperture Radar (SAR) and other satellite remote sensing technologies in agriculture, particularly focusing on crop classification, crop monitoring, and yield prediction. The study addresses critical challenges in effectively leveraging vast spatiotemporal data by integrating SAR data with deep learning, machine learning, and time-series analysis techniques to estimate crop attributes, crop biophysical parameters, and crop yield with improved accuracy. A novel contribution of this research is the development of self-supervised learning foundation models and the fusion of SAR and optical data to enhance predictions of crop yield, Vegetation Water Content (VWC), and crop height. The research also investigates the integration of dynamic SAR-based planting dates into crop models, improving yield estimation in rainfed paddy fields in Cambodia. The findings reveal that SAR-derived planting dates significantly enhance yield predictions by reducing uncertainty and improving accuracy compared to traditional methods. Spanning diverse climatic zones and management practices, this research demonstrates the exceptional potential of the VH channel of Sentinel-1 SAR data for near-accurate yield prediction across different crops, including Michigan’s non-irrigated corn, soybean, and winter wheat. The study also highlights the effectiveness of patch-based 3D Convolutional Neural Networks (3D-CNNs) and XGBoost in yield estimation, particularly in scenarios with limited reference data.
In addition, this dissertation introduces a novel approach for estimating VWC and crop height using geospatial foundation models, demonstrating superior accuracy and generalizability across varied agricultural landscapes. The integration of SAR, optical indices, and climatic data significantly improved the reliability of VWC and crop height estimations, with NDVI, NDWI, VH backscatter, and precipitation emerging as key drivers. The research underscores the need for continued innovation in remote sensing technologies, offering new insights for precision agriculture and supporting sustainable farming practices.

Keywords: Synthetic Aperture Radar (SAR), Foundation models, Vegetation Water Content, Deep learning.

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to everyone who has supported me throughout the course of my PhD journey. This dissertation would not have been possible without the encouragement and assistance of many individuals. I am profoundly grateful to my advisor, Dr. Narendra Das, for his continuous guidance, support, and insightful feedback. His expertise was invaluable in shaping this dissertation. I also wish to thank my committee members, Dr. Pang-Ning Tan, Dr. Hamed Alemohammad, and Dr. Mantha Phanikumar, for their thoughtful contributions and challenging questions that greatly enhanced the quality of this research. I am thankful to NASA Grant #80NSSC21K0403 for providing the financial support that made this research possible. I also want to acknowledge the Civil Engineering and the Biosystems and Agricultural Engineering departments at Michigan State University for providing the resources and facilities required for this research. I would like to extend my deepest gratitude to Dr. Brook Wilke and the Kellogg Biological Station for their invaluable support during my field campaign.
I am also grateful for the support provided by the USDA Long-Term Agroecosystem Research (LTAR) Program and the NSF Long-Term Ecological Research Program (DEB2224712) at the Kellogg Biological Station. Additionally, I would like to acknowledge the funding and resources provided by Michigan State University AgBioResearch, which were instrumental in the completion of this research. To my family, especially my husband Ehsan and my lovely son, Aiden, thank you for your unconditional love, encouragement, and for believing in me every step of the way. Finally, I am grateful to all those who have been part of this journey, whether mentioned here or not. Your contributions, in whatever form, have been integral to the completion of this work.

TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION AND LITERATURE REVIEW
  1.1. Introduction
  1.2. SAR in Agricultural Applications
  1.3. Use of Deep Learning in Agricultural Applications of SAR
  1.4. Challenges
  1.5. Opportunities
  1.6. Conclusion
  REFERENCES
CHAPTER 2: IMPACT OF SAR PLANTING DATE ON CROP MODEL YIELD ESTIMATION
  2.1. Introduction
  2.2. Study area, Datasets, and Tools
  2.3. Methodology
  2.4. Results
  2.5. Discussion
  2.6. Conclusion
  REFERENCES
  APPENDIX
CHAPTER 3: YIELD ESTIMATION FROM SAR DATA USING PATCH-BASED DEEP LEARNING AND MACHINE LEARNING TECHNIQUES
  3.1. Introduction
  3.2. Study Area and Material
  3.3. Patch-Based Regression Methodology for Yield Estimation
  3.4. Experiments and Results
  3.5. Discussion
  3.6. Conclusion
  REFERENCES
  APPENDIX
CHAPTER 4: ESTIMATING CROP BIOPHYSICAL PARAMETERS USING SELF-SUPERVISED LEARNING WITH FOUNDATION MODELS AND SAR-OPTICAL OBSERVATIONS
  4.1. Introduction
  4.2. Measurements and Case Study
  4.3. Methodology
  4.4. Result
  4.5. Discussion
  4.6. Conclusion
  REFERENCES
  APPENDIX
CONCLUSION
FUTURE RESEARCH DIRECTIONS

CHAPTER 1: INTRODUCTION AND LITERATURE REVIEW

1.1. Introduction

Satellite-based remote sensing (RS) through optical/thermal sensors and synthetic aperture radar (SAR) (e.g., Sentinel-1A/B (Torres et al., 2012), Sentinel-2A/B (Drusch et al., 2012a), and Landsat-8 (Roy et al., 2014)) has revolutionized our ability to collect vast amounts of open-access images at various temporal, spectral, and spatial resolutions. This has led to the creation of Big Data on a wide range of geophysical and biophysical features across the Earth's surface (Adrian et al., 2021; Kussul et al., 2017). However, analyzing extensive time-series RS data and extracting features by understanding sequential relationships for classification applications has consistently presented challenges to scientists (Zhong et al., 2019). The emergence of machine learning (ML) techniques has significantly enhanced scientists' capabilities in refining crop type mapping processes. Nevertheless, these methods often rely heavily on comprehensive feature engineering and the use of external indices (Zheng and Casari, 2018), and struggle to capture the dynamic temporal behaviors of crop classes, which fluctuate according to seasonal cycles (Wang et al., 2021). The advent of cloud computing has significantly empowered the RS community to explore new avenues for creating classification maps, particularly by harnessing the sophisticated capabilities of Deep Learning (DL) algorithms (Brown et al., 2022; Ma et al., 2019). Unlike traditional ML methods, deep neural networks offer an advanced methodology through multiple interconnected layers that facilitate automatic feature extraction and representation learning (Kamilaris and Prenafeta-Boldú, 2018).
They also excel at identifying both spatial and temporal relationships within RS data, from the level of individual pixels to broader parcel scales, greatly improving the accuracy of models that depict the complex dynamics of crop phenology (Han et al., 2023).

While DL techniques have been successfully applied with multispectral sensors for various agricultural tasks, such as crop classification (Dong et al., 2016; Drusch et al., 2012), monitoring (Katal et al., 2022), and yield prediction (Qiao et al., 2021; Wang et al., 2023), the integration of SAR imagery has opened up new possibilities for enhancing these applications, because SAR offers a consistent acquisition schedule and remains unaffected by cloud cover and the day-night cycle (Steele-Dunne et al., 2017). SAR data overcome the limitations of multispectral sensors, such as susceptibility to cloud cover, background interference, aerosol effects, and saturation in regions of high biomass (Soudani et al., 2008). Furthermore, SAR observations are sensitive to water under the canopy, such as in the stem and ears, which is not detectable by multispectral sensors (Judge et al., 2021; Togliatti et al., 2022). Despite the significant potential of SAR data for agricultural applications, its usage comes with a multitude of challenges, such as the speckle effect, the complexity of information due to both amplitude and phase, geometric distortions inherent in the side-looking nature of SAR, and temporal decorrelation (Oveis et al., 2022). SAR data can be complex, noisy, and difficult to interpret, especially in agricultural applications where a wide range of factors can influence the signals received (Alemohammad et al., 2018). In this chapter, we discuss how DL can be instrumental in addressing these challenges and in extracting valuable information from a large stack of SAR images (Ma et al., 2014).
To the best of our knowledge, this is the first review that explores the intersection of common and emerging DL techniques, SAR observations, and their classification and monitoring applications in agriculture.

1.2. SAR in Agricultural Applications

1.2.1. Techniques

SAR technology has proven to be a valuable tool for monitoring and assessing agricultural landscapes. This section will delve into the key techniques employed in SAR systems for agricultural applications, namely backscatter, polarimetry, and interferometry. By examining the principles and applications of each technique, we aim to provide a comprehensive understanding of how SAR data are utilized to extract crucial information for crop classification, health, growth (i.e., tracking phenology), and management practices.

1.2.1.1. Backscatter

Backscatter in SAR refers to the portion of the transmitted radar signal that is reflected back to the sensor by the target surface, providing information about the target's physical properties and structure. Three properties of SAR imagery make it well suited to agricultural applications: (i) the sensitivity of SAR backscatter to the dielectric properties, size, shape, orientation, roughness, and distribution of canopy elements (e.g., leaves, stems, and fruits) (McDonald et al., 2000), (ii) the ability of exact-repeat, multi-temporal SAR observations to capture crop growth stages and crop structure variation, enabling improved distinction among individual crops (Deschamps et al., 2012; McNairn et al., 2009), and (iii) the high spatial resolution (≤50 m) of backscatter data, which is instrumental in tracking crop growth/phenology and health status at a field scale.
The SAR backscatter signal from vegetated surfaces primarily comprises three major first-order components: (i) surface scattering from the soil; (ii) multiple (volume) scattering from the canopy; and (iii) double-bounce scattering from the interaction between the canopy and the soil surface (Lopez-Sanchez and Ballester-Berman, 2009; Ulaby et al., 1996). Several factors influence the interaction between the active microwave signal and canopy structure, including SAR instrument characteristics, microwave frequency, and incidence and azimuth angles (Balenzano et al., 2010). Thus, SAR observations enable the characterization of unique structural attributes and dielectric properties of crop canopies, providing valuable insights for phenology tracking and crop discrimination (McNairn et al., 2009).

When considering the SAR configuration for agricultural applications, the choice of frequency is crucial. This decision is not straightforward and must take into account the canopy characteristics, such as crop type and development stage. Recent studies have shown that SAR backscatter data at X-band (~9.65 GHz) (Fontanelli et al., 2022; Lopez-Sanchez et al., 2011; Phan et al., 2018; Ryu and Lee, 2023), C-band (~5.6 GHz) (Canisius et al., 2018; Inoue et al., 2014; Lopez-Sanchez et al., 2013; Mascolo et al., 2015; Skakun et al., 2015a; Skriver, 2011), and L-band (~1.4 GHz) (Busquier et al., 2022; Huang et al., 2021; Khabbazan et al., 2022; Kim et al., 2018; Whelen and Siqueira, 2017) have the potential for crop classification and monitoring applications. Lower-frequency bands, such as the C- and L-band, are capable of penetrating deeper into the canopy, providing insights into the plant structure. Conversely, the higher-frequency X-band, due to its limited penetration ability, is more effective in correlating with surface-level canopy details, such as the weight of rice heads, highlighting its utility for precise measurements.
However, while the X-band has shown potential for early growth monitoring and grain yield estimation, it faces challenges in accurately correlating with volumetric properties of the canopy that influence LAI and biomass (Inoue et al., 2002). Building on this understanding, Busquier et al. (2022) further established the comparative advantages of frequency bands for agricultural applications. Specifically, they found that for crop classification tasks, C-band data typically outperform X-band data. This superior performance is attributed to the C-band's optimal sensitivity towards both vegetation and soil moisture (SM) levels, crucial factors that significantly enhance the effectiveness of differentiating various crop types. However, C-band SAR signals, with their shorter wavelengths compared to L-band, interact more with smaller vegetation elements like leaves and small stems, making them suitable for discriminating herbaceous crops such as wheat, alfalfa, and canola, even at moderate growth stages. In contrast, L-band SAR signals, having longer wavelengths, are less affected by the upper canopy layers and interact more with intermediate-sized crop elements like stems and leaf ribs of wide-leaf crops such as corn and sunflower. This L-band characteristic allows for better sensitivity to biomass in crops with low plant density. However, for crops with high plant density, both L-band and C-band provide useful information for biomass estimation, with C-band saturating earlier than L-band due to the significant contribution of leaves to backscatter at C-band. In fact, for broad-leaf crops, the leaf contribution to backscatter at C-band is significant and comparable to that of stems. Conversely, the leaf contribution is minimal at L-band for most crop types, as the longer wavelengths interact more with the larger plant structures (Ferrazzoli et al., 1997).
The above-mentioned characteristics make L-band particularly well-suited for assessing vegetation properties such as biomass, structure, density, height, and vegetation water content (VWC), as well as SM beneath the canopy (Dobson et al., 1985). Although some studies not using DL techniques have shown that integrating different SAR frequencies can enhance crop classification accuracy significantly, reporting improvements of up to 37% for early-season and 5% for end-season classifications (McNairn et al., 2014), most of the studies combining SAR with DL for crop classification, monitoring, and yield estimation (75 out of 82) utilized a single frequency. In cases where frequency integration was used, the comparison results between multi-frequency and single frequency were not reported. Additionally, Busquier et al. (2022) suggested that having a longer time-series of images, even from a single frequency band, can be as beneficial as combining data from multiple frequency bands with fewer images per band.

Figure 1.1 illustrates the usage of various SAR platforms and frequency bands along with DL across different agricultural applications. The C-band emerges as the most frequently utilized band, represented in 70 out of 82 studies (85%), followed by the L and P bands (4 studies each) and the X band (3 studies). The figure indicates a predominant use of the C-band frequency for classification/mapping tasks within agricultural contexts. Despite the L-band's potential advantages for crop monitoring and yield estimation, its application in these areas remains limited, with the C-band, especially images from Sentinel-1, being the favored option. This preference is largely attributed to the cost-free access to C-band Sentinel-1 imagery, making it a more feasible and attractive option for agricultural purposes.

Figure 1.1: The stacked bar chart illustrates the distribution of SAR platforms used in conjunction with DL for various agricultural applications.
Each bar represents the count for a specific application, and the height of the bar indicates the cumulative count across multiple SAR platforms. On the x-axis, 'CPIS' refers to Center Pivot Irrigation Systems.

In addition to frequency, the SAR signal's interaction with the crop canopy is influenced by the polarization of the signals transmitted and received by the SAR system. Polarization refers to the orientation of the electric field in the electromagnetic wave. Single-polarization SAR systems measure only one polarization (e.g., HH or VV), while dual-polarization SAR systems measure two polarization combinations (e.g., HH and HV, or VV and VH). A signal with co-polarization, such as Vertical-Vertical (VV), exhibits heightened sensitivity to the vertical alignment of leaves (Le Toan et al., 1997). Conversely, a cross-polarization channel like Vertical-Horizontal (VH) demonstrates a stronger association with the Leaf Area Index (LAI) due to the volume scattering occurring within the crop canopy (Inoue et al., 2014; McNairn and Brisco, 2004).

In SAR imagery, three principal metrics, Beta-naught ($\beta^0$), Sigma-naught ($\sigma^0$), and Gamma-naught ($\gamma^0$), quantify the returned radar backscatter. While Beta-naught measures backscatter in slant-range geometry, reflecting surface properties, it is less commonly used in agricultural applications. Sigma-naught and Gamma-naught are more relevant for agricultural studies, as they account for local incidence angle and terrain-induced variations, respectively. Sigma-naught corrects for local incidence angle, presenting backscatter in ground-range geometry as a natural normalization of beta-naught. Gamma-naught, adjusted for the plane perpendicular to the slant range, excels in areas of significant topographic variation, where beta-naught and sigma-naught may falter (Small, 2011).
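The relationship between the three backscatter conventions can be illustrated with a minimal sketch. It assumes the commonly used incidence-angle relations $\sigma^0 = \beta^0 \sin\theta$ and $\gamma^0 = \sigma^0 / \cos\theta$, which hold for an ellipsoid (flat-terrain) reference and are not the terrain-flattening procedure of Small (2011). The function names and sample values are hypothetical and chosen only for illustration.

```python
import math


def beta0_to_sigma0(beta0, incidence_deg):
    """Convert beta-naught (slant range) to sigma-naught (ground range).

    Assumes the simple relation sigma0 = beta0 * sin(theta), with theta the
    incidence angle in degrees and backscatter in linear power units.
    """
    return beta0 * math.sin(math.radians(incidence_deg))


def sigma0_to_gamma0(sigma0, incidence_deg):
    """Convert sigma-naught to gamma-naught.

    Assumes gamma0 = sigma0 / cos(theta), i.e. normalization to the plane
    perpendicular to the line of sight.
    """
    return sigma0 / math.cos(math.radians(incidence_deg))


def to_db(linear_power):
    """Express a linear-power backscatter coefficient in decibels."""
    return 10.0 * math.log10(linear_power)


# Illustrative values only (not taken from any dataset).
beta0 = 0.2
theta = 30.0
sigma0 = beta0_to_sigma0(beta0, theta)
gamma0 = sigma0_to_gamma0(sigma0, theta)
print(f"sigma0 = {to_db(sigma0):.2f} dB, gamma0 = {to_db(gamma0):.2f} dB")
```

Under these assumptions gamma-naught equals $\beta^0 \tan\theta$, which removes part of the incidence-angle trend that remains in sigma-naught.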
While most papers have used sigma-naught ($\sigma^0$) as the input feature for their DL algorithms, several studies have highlighted the advantages of gamma-naught over sigma-naught for crop mapping and monitoring, as it is less dependent on incidence angle (Lobert et al., 2023; Pandžić et al., 2024; Sonobe, 2019; Wicks et al., 2018). Sigma-naught is more commonly used for applications involving flat terrain, while gamma-naught is preferred when dealing with areas of varying topography (Small, 2011).

Figure 1.2 presents time-series data of the backscatter coefficient ($\sigma^0$) in both VV and VH channels, as well as their ratio, from Sentinel-1 C-band SAR signals for sixteen crop types. At any single point in time, two crops may exhibit similar backscatter values, but as the crop structure evolves, particularly during seed and fruit development stages, the backscatter signature changes accordingly. By acquiring multi-temporal SAR data, these changes can be captured and analyzed to distinguish different crop types and monitor their growth. However, interpreting the sequential relationship in the SAR time-series can be challenging, and DL has been recognized as a valuable tool in addressing these challenges by learning spatial and temporal relationships of crops at the pixel or parcel level (Han et al., 2023).

Figure 1.2: Time-series data for the backscatter coefficient ($\sigma^0$) in both VV and VH channels, along with their ratio, all measured in decibels (dB), across sixteen crop types. For each date, the mean value is depicted by a solid line, while the standard deviation is illustrated through a shaded region surrounding the mean (Adapted from Villarroya-Carpio et al., 2022).

Preprocessing of SAR images is a critical step in improving their quality and enhancing their interpretability. Out of the 82 papers analyzed, 60 (75%) have conducted SAR preprocessing, while the remaining papers have used preprocessed SAR data from public or non-public datasets.
The majority of these preprocessing efforts were focused on Sentinel-1 data, with only eight papers addressing RADARSAT-2 and AIRSAR. The preprocessing workflow is presented in the flowchart in Figure 1.5. The majority of the papers employed the Sentinel Application Platform (SNAP) software for SAR processing, with a few exceptions. Garnot et al. (2022) and Kussul et al. (2018) utilized the Orfeo toolbox and the Sentinel-1 Toolbox (S1TBX), respectively, while Mei et al. (2018) used PolSARpro to generate the coherency matrix (Section 1.2.1.2) for AIRSAR data. Recently, cloud-based platforms such as Google Earth Engine have gained popularity for SAR processing, as demonstrated by Paul et al. (2022), Ngo et al. (2023), and Y. Zhou et al. (2019). Most of the papers utilized Lee and Refined Lee filters for speckle filtering, making them the most commonly employed methods for reducing the speckle effect in SAR imagery for agriculture. Similarly, the Shuttle Radar Topography Mission (SRTM) product was frequently used as the digital elevation model (DEM) for terrain and geometric corrections.

To summarize, the selection of SAR backscatter observables from specific bands as input features for DL classifiers was strategically tailored to align with the crop types, structural characteristics, and intended agricultural application. This included employing both co-polarization and cross-polarization, or their combinations, standardized into decibel scales.

1.2.1.2. Polarimetry

In addition to single and dual-polarization SAR systems, fully polarimetric SAR (PolSAR) systems, also known as quad-polarization systems, have the capability to transmit and receive electromagnetic waves in all four possible combinations of horizontal (H) and vertical (V) polarizations: HH, HV, VH, and VV.
This allows the PolSAR systems to measure the complete scattering matrix of a target, which consists of four complex elements (S_HH, S_HV, S_VH, and S_VV) that describe how the target interacts with the incident electromagnetic wave and changes the polarization state of the scattered wave. By measuring the complete scattering matrix, PolSAR systems provide valuable insights into the scattering mechanisms and physical characteristics of agricultural crops and the underlying soil surface (Hajnsek and Desnos, 2021; Lee and Pottier, 2017). However, interpreting the raw scattering matrix directly can be challenging. Therefore, to better understand and interpret the scattering behavior, polarimetric decomposition techniques are employed. Polarimetric decomposition is the process of breaking down the PolSAR scattering matrix into simpler, physically meaningful components that represent different scattering mechanisms such as surface, volume, and double-bounce scattering (Vicente-Guijalba et al., 2014). The temporal evolution of the relative contributions of these scattering mechanisms relates to crop growth stages, and therefore, they are effective features for crop classification applications. In the early growth stages, surface scattering typically dominates as the SAR signal primarily interacts with the soil surface; double-bounce scattering can also occur if the crop has vertical structure or residues that facilitate this interaction. In the specific case of rice planting in flooded fields, double-bounce scattering becomes prominent right after transplanting, due to the radar signal reflecting off both the flat water surface and the upright rice stalks. As the crop grows and the canopy develops, volume scattering becomes more prominent, since the SAR signal interacts with the leaves, stems, and fruits of the plants. This volume scattering is mainly related to VWC and LAI.
Moreover, the combination of both double-bounce and volume scattering can provide valuable information for biomass and yield estimation, as it captures the density and structure of the crop canopy. To facilitate the interpretation of PolSAR data, various decomposition strategies have been proposed, including those by Cloude and Pottier (1996), Freeman and Durden (1998), Yang et al. (1998), Yamaguchi et al. (2005), Cameron and Rais (2006), and Raney et al. (2012). Pauli decomposition, based on the Pauli matrices (Cloude and Pottier, 1996), is widely used to break down the backscatter matrix into surface, double-bounce, and volume scattering mechanisms. Cloude and Pottier also introduced a method based on eigenvectors and eigenvalues for deriving decomposition parameters from the coherency matrix (a 3×3 Hermitian matrix derived from the scattering matrix that characterizes the polarimetric properties of a target), helping to identify entropy (H, indicating the randomness of scattering mechanisms), alpha angle (α, showing the main or average scattering mechanism), and anisotropy (A, evaluating the intensity difference between the second and third scattering mechanisms) (Lee and Pottier, 2017; Xu and Jin, 2005). Additionally, pedestal height, the ratio of the smallest to the largest eigenvalue, has been adopted to indicate the share of unpolarized scattering (Lee and Pottier, 2017). In addition to the Cloude-Pottier decomposition, model-based decomposition techniques such as Freeman-Durden (Freeman and Durden, 1998) and Yamaguchi (Yamaguchi et al., 2005) decompositions have been widely used for several decades. The Freeman-Durden decomposition conceptualizes the covariance matrix as deriving from three distinct scattering mechanisms, enabling identification of the predominant scattering types (Freeman and Durden, 1998; Lee and Pottier, 2017). Yamaguchi et al.
(2005) enhanced this model by incorporating helix scattering power (the co-pol and the cross-pol correlations) as a fourth element. The m-chi decomposition technique, introduced by Raney et al. (2012) for lunar and astronomical studies, offers a valuable approach for analyzing polarimetric SAR data in various terrestrial applications, including agriculture. This decomposition is centered around two parameters: m, which measures the portion of the electromagnetic wave that is polarized, and chi, the Poincaré ellipticity parameter. Different types of crops and their varying conditions (e.g., healthy, stressed, different growth stages) can alter the degree of polarization of the radar signal (m), while the structural characteristics and orientation of crop leaves and stems, as well as the properties of the underlying soil, can influence the chi parameter.

While most reviewed studies predominantly input raw linear polarization data (e.g., VV, HH, VH) into their DL classifiers, several have further enhanced classification accuracy in agricultural applications by effectively utilizing polarimetric decomposition parameters along with single and dual polarization measurements (Gu et al., 2019; Han et al., 2022; Komisarenko et al., 2022; Li et al., 2021; K. Li et al., 2022; Ma et al., 2022; Zhang et al., 2020).
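The eigenvalue-based parameters described above (entropy, anisotropy, alpha angle, and pedestal height) can be sketched as follows. This is a minimal illustration, assuming a single 3×3 Hermitian coherency matrix expressed in the Pauli basis, with the first element corresponding to HH+VV. The function name `h_a_alpha` and the small regularization constant `eps` are hypothetical choices made here, not part of any reviewed study or toolbox.

```python
import numpy as np


def h_a_alpha(coherency, eps=1e-12):
    """Return (entropy, anisotropy, mean alpha in degrees, pedestal height).

    `coherency` is assumed to be a 3x3 Hermitian coherency matrix in the
    Pauli basis. `eps` guards against division by zero and log(0).
    """
    # eigh returns real eigenvalues in ascending order for Hermitian input.
    eigvals, eigvecs = np.linalg.eigh(coherency)
    order = np.argsort(eigvals)[::-1]            # sort descending
    lam = np.clip(eigvals[order], 0.0, None)     # remove tiny negative values
    vecs = eigvecs[:, order]

    # Pseudo-probabilities of the three scattering mechanisms.
    p = lam / (lam.sum() + eps)

    # Entropy with logarithm base 3, so that 0 <= H <= 1.
    entropy = float(-np.sum(p * np.log(p + eps)) / np.log(3.0))

    # Anisotropy compares the second and third eigenvalues.
    anisotropy = float((lam[1] - lam[2]) / (lam[1] + lam[2] + eps))

    # Alpha of each mechanism from the first eigenvector component,
    # then the probability-weighted mean alpha.
    alpha_i = np.arccos(np.clip(np.abs(vecs[0, :]), 0.0, 1.0))
    alpha = float(np.degrees(np.sum(p * alpha_i)))

    # Pedestal height: smallest over largest eigenvalue.
    pedestal = float(lam[2] / (lam[0] + eps))
    return entropy, anisotropy, alpha, pedestal


# Illustrative case: one deterministic mechanism in the first Pauli channel.
surface_like = np.diag([1.0, 0.0, 0.0]).astype(complex)
print(h_a_alpha(surface_like))
```

In practice the coherency matrix is estimated per pixel after spatial averaging, so the same computation would be applied over an image stack rather than a single matrix.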
In addition to decomposition parameters from PolSAR data, various SAR indices have been used. Span is calculated as the sum of the polarization channels (HH, VV, HV, and VH) and measures the total backscattering strength from these polarizations (Yahia et al., 2020). The cross ratio (CR) is $\sigma^0_{VH}/\sigma^0_{VV}$. The quad-pol Radar Vegetation Index (RVI), which measures the randomness of scattering mechanisms, is $8\sigma^0_{HV}/(\sigma^0_{HH} + \sigma^0_{VV} + 2\sigma^0_{HV})$ (Zhang et al., 2020); it was modified for dual-pol SAR data as $4\sigma^0_{HV}/(\sigma^0_{HV} + \sigma^0_{HH})$ (Trudel et al., 2012) and later adopted by several studies as $4\sigma^0_{VH}/(\sigma^0_{VH} + \sigma^0_{VV})$ using Sentinel-1 dual-pol data (VV-VH) (Nasirzadehdizaji et al., 2019). Other indices include the Polarimetric Radar Vegetation Index (PRVI), $(1 - \text{degree of polarization}) \times \sigma_{HV}$ (Chang et al., 2018); the Dual-pol Radar Vegetation Index (DpRVI), $1 - (\text{degree of polarization} \times \text{normalized dominant eigenvalue})$ (Mandal et al., 2020); and the Dual Polarization SAR Vegetation Index (DPSVI), $(\sigma_{VV} + \sigma_{VH})/\sigma_{VV}$ (Periasamy, 2018). Furthermore, Nasirzadehdizaji et al. (2019) introduced a new index, $(\sigma^0_{VV} - \sigma^0_{VH})/(\sigma^0_{VV} + \sigma^0_{VH})$, for estimating crop height and canopy coverage. This index was later utilized by Sun et al. (2022) for rice mapping using Sentinel-1 data. Mei et al. (2018) enhanced crop classification accuracy by computing RVI from the eigenvalues of the H/A/Alpha polarimetric decomposition instead of linear polarizations, formulated as $4\lambda_3/(\lambda_1 + \lambda_2 + \lambda_3)$, and from the scattered power components derived from the Freeman-Durden decomposition, expressed as $F_v/(F_v + F_d + F_s)$, where $F_v$, $F_d$, and $F_s$ represent the volume, double-bounce, and surface scattering components, respectively.

Texture features extracted from polarimetric SAR images are another measure that captures the structural characteristics of the target surface and its surrounding environment, providing insights into spatial variations in land cover. The Gray-Level Co-occurrence Matrix (GLCM) technique, introduced by Haralick et al.
(1973), is a statistical method widely used to extract texture features from polarimetric SAR images. It analyzes the spatial relationship between pixels by considering the frequency of occurrence of pairs of pixel values at a specified distance and orientation. Hoa et al. (2019) highlighted the importance of GLCM-derived textural features for soil salinity detection using polarimetric SAR imagery. While various studies have adopted combinations of different decomposition parameters and SAR indices as input features for DL classifiers, future research should focus on feature selection, as detailed in Section 1.3.1.3. This is crucial for minimizing feature redundancy and selecting the optimal parameters based on the intended application.

1.2.1.3. Interferometry

In addition to SAR observables derived from measured backscattered intensity and polarimetric decomposition parameters, SAR also provides access to interferometric data. The technique combines pairs of exact-repeat SAR images to produce phase measurements that relate to the vertical dimension of the scene being observed, among other properties (Bamler and Hartl, 1998). A critical element in interferometric SAR (InSAR) is interferometric coherence, a measure that reflects the quality of the interferometric phase and, by extension, the quality of the products derived from it. This coherence is influenced by a variety of factors, including scene characteristics, sensor specifications, and the configuration of the interferometric pair itself (Zebker and Villasenor, 1992). Repeat-pass InSAR involves capturing SAR images of the same area at different times, allowing for the detection of changes within a scene.
This technique has proven effective in measuring decreases in interferometric coherence, which often occur due to the rapid growth of plants or wind-induced movements, leading to temporal decorrelation, particularly over areas with agricultural crops (Rosen et al., 2000). The time gap between the two acquisitions can range from days to weeks or even months, depending on the satellite revisit time and the application requirements. The presence or absence of vegetation, as indicated by changes in coherence over time, provides insights into the agricultural calendar and, consequently, the types of crops present in a given area. This aspect of temporal decorrelation has been extensively explored through the use of time-series data from various SAR satellite sensors, enabling detailed mapping of crop types (Busquier et al., 2022). Villarroya-Carpio et al. (2022) established a strong correlation between the coherence measured in each polarimetric channel (VV and VH) and the NDVI, proposing that data from intensity and interferometry serve as complementary sources, a concept previously validated by Mestre-Quereda et al. (2020) in crop-type mapping. In addition to repeat-pass interferometry, single-pass interferometry has also demonstrated potential for crop classification and vegetation height estimation. Single-pass interferometry involves acquiring two SAR images simultaneously or within a very short time interval, eliminating temporal decorrelation effects. This technique is particularly useful for capturing the vertical structure of vegetation. Erten et al. (2016) found that single-pass interferometry provided valuable information about crop height and structure, especially when using large spatial baselines. Building on this, Busquier et al. (2020) showed that single-pass coherence can help improve classification accuracy in both dual-pol and single-pol cases, with more notable improvements observed for taller crops.
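The coherence discussed in this section is commonly estimated as the magnitude of the normalized complex cross-correlation of two co-registered single-look complex (SLC) images over a small spatial window. The sketch below illustrates that sample estimator on synthetic data; the function names, the 7 x 7 boxcar window, and the random test images are assumptions made for illustration, and operational processing would also remove the topographic and flat-earth phase before estimation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def boxcar_mean(x, win):
    """Mean over every win x win window (only windows fully inside the image)."""
    return sliding_window_view(x, (win, win)).mean(axis=(-2, -1))

def coherence(slc1, slc2, win=7, eps=1e-12):
    """Sample coherence magnitude of two co-registered complex images."""
    cross = boxcar_mean(slc1 * np.conj(slc2), win)   # averaged interferogram
    p1 = boxcar_mean(np.abs(slc1) ** 2, win)         # averaged power, image 1
    p2 = boxcar_mean(np.abs(slc2) ** 2, win)         # averaged power, image 2
    return np.abs(cross) / np.maximum(np.sqrt(p1 * p2), eps)

rng = np.random.default_rng(0)
shape = (32, 32)
stable = rng.normal(size=shape) + 1j * rng.normal(size=shape)
changed = rng.normal(size=shape) + 1j * rng.normal(size=shape)

gamma_same = coherence(stable, stable)    # scene unchanged between acquisitions
gamma_diff = coherence(stable, changed)   # fully decorrelated scene
```

An unchanged scene yields a coherence of one everywhere, while the decorrelated pair yields low values, which mirrors the temporal decorrelation that growing or wind-blown crops introduce between repeat passes.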
The sensitivity to vertical structure provided by the physical baseline in single-pass interferometry allows for better discrimination between crop types based on their height and architectural differences. While both repeat-pass and single-pass interferometry offer valuable insights into crop characteristics, combining interferometric techniques with polarimetric information (PolInSAR) can provide enhanced sensitivity to vegetation structure and height. As reviewed by Romero-Puig and Lopez-Sanchez (2021), PolInSAR techniques have been successfully applied to crop height estimation, offering advantages over single-polarization interferometry or polarimetry alone. By exploiting the varying penetration depths of different polarizations, PolInSAR can more accurately locate the scattering phase centers within the vegetation volume. This allows for improved estimation of crop height, especially for taller or denser crops where single-channel approaches may saturate. While PolInSAR generally requires fully polarimetric data, which limits coverage compared to single-pol acquisitions, it provides a powerful tool for crop monitoring when such data are available. Despite its potential, interferometric coherence has been underutilized in crop mapping and monitoring using DL, having been employed in only two crop classification studies. Future research should concentrate on harnessing the potential synergies between intensity data and interferometric measurements to enhance crop mapping, monitoring, and the detection of agricultural management practices.

1.2.2. Applications

The SAR techniques described in the previous section provide valuable insights into the physical properties and temporal dynamics of agricultural landscapes. These SAR-derived features serve as input to DL models, enabling the development of architectures for crop classification/mapping, monitoring, and yield estimation.
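To make the notion of SAR-derived input features concrete, the following sketch builds a small feature stack from dual-pol (VV, VH) backscatter using four of the indices listed in Section 1.2.1.2. The function name, the feature ordering, and the sample values are illustrative assumptions; the inputs are assumed to be in linear power units rather than decibels, and the DPSVI line simply follows the expression as quoted in the text.

```python
import numpy as np

def dual_pol_features(vv, vh, eps=1e-12):
    """Stack VV, VH and four derived indices along a new leading axis."""
    vv = np.asarray(vv, dtype=float)
    vh = np.asarray(vh, dtype=float)
    cr = vh / (vv + eps)                  # cross ratio, VH / VV
    rvi = 4.0 * vh / (vh + vv + eps)      # dual-pol RVI for VV-VH data
    dpsvi = (vv + vh) / (vv + eps)        # DPSVI in the form quoted in the text
    nd = (vv - vh) / (vv + vh + eps)      # normalized difference of VV and VH
    return np.stack([vv, vh, cr, rvi, dpsvi, nd], axis=0)

# Two synthetic pixels with backscatter in linear power units (not dB).
features = dual_pol_features(vv=[0.10, 0.20], vh=[0.02, 0.05])
```

The resulting array has one channel per feature, so it can be passed to a classifier in the same way as the raw VV and VH channels. The small `eps` only guards against division by zero over no-data pixels.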
In this section, we review how SAR observables from linear polarization, PolSAR, and InSAR have been used as input features to DL approaches for the aforementioned applications.

1.2.2.1. Classification/Mapping

SAR linear polarizations, specifically VH and VV, have been the subject of extensive investigation for their utility in crop mapping. Research, including studies by Asadi and Shamsoddini (2024), Liu et al. (2023), and Zhou et al. (2019b), highlights the superior performance of cross polarization (VH or HV) in identifying the majority of crops. However, a synergistic approach combining both VH and VV polarizations has been shown to provide a more detailed insight into crop characteristics, thereby improving model accuracy. This is supported by findings from Jo et al. (2022) and Y. Zhou et al. (2019), which demonstrated the enhanced crop classification accuracy achieved through the integrated use of VH and VV signals, as further evidenced by Magalhães et al. (2022), Paul et al. (2022), and Liu et al. (2023). Their research underscores the advantage of this combined approach over the exclusive use of VH, VV, or their ratio. Additionally, the integration of various SAR polarimetric parameters from the Pauli, Cloude and Pottier (H, A, alpha), Yamaguchi, and Freeman–Durden decompositions, as explored by K. Li et al. (2022), Mei et al. (2018), Yin et al. (2023), and Zeyada et al. (2016), extended the potential of SAR data for high-accuracy crop mapping. These studies highlight the value of leveraging a multi-dimensional dataset to refine crop classification methods. McNairn et al. (2009) found that L-band polarimetric parameters derived from three decomposition approaches (Cloude–Pottier, Freeman–Durden, and Krogager) yielded higher crop classification accuracies than those achieved using single and dual polarization. Further advancing this research, McNairn et al.
(2014) showed that employing PolSAR technology over linear polarization could improve the Overall Accuracy (OA) of crop classifications by up to 7%. Extending these insights, Ma et al. (2022) confirmed that polarimetric decomposition parameters specifically enhance rice mapping accuracy by an additional 3% over traditional VV and VH polarizations. In addition to PolSAR data, the role of interferometric coherence in crop classification has gained attention. Busquier et al. (2022) further explored the synergistic effects of combining backscatter intensity with repeat-pass interferometric coherence, particularly emphasizing the value of C-band coherence in enhancing classification accuracies. Despite the lower performance of X-band coherence data due to quicker decorrelation, its fusion with C-band data significantly improved classification outcomes. Further, Ni et al. (2022) introduced the asymmetric coherence term or polarimetric ratio, which focuses on variations in polarimetric properties and radiometric changes between observations, whereas traditional coherence measures the temporal correlation and stability of scattering properties in SAR data. They illustrated that using asymmetric coherence can improve classification accuracy by 20% to 50% compared to traditional coherence-based methods. Consequently, given InSAR's potential to enhance crop classification accuracy, further research should focus on utilizing this SAR technique in conjunction with intensity measurements and polarimetry.

1.2.2.2. Crop monitoring: Phenology and Biophysical Parameters Estimation

In the domain of crop phenology and BPs estimation, SAR data have proven to be a powerful tool, providing detailed insights into agricultural crop dynamics. Within the extensive set of linear and dual polarization SAR data, VH in particular emerges as a critical feature due to its heightened sensitivity to volumetric scattering within crop canopies.
This characteristic makes VH polarization particularly effective in distinguishing between different crop growth stages, as it captures the complex scattering interactions among canopy components, such as leaves, stems, and branches. The superiority of VH polarization over VV polarization lies in its reduced sensitivity to factors like water, topography, and the canopy's morphological structure, as demonstrated in studies focusing on rice phenology estimation (Yang et al., 2021). In addition to linear polarization, SAR polarimetric decompositions and radar indices have established their significance by revealing key physical attributes closely associated with various crop BPs (Mandal et al., 2021). Notably, features like the RVI, Entropy, and the Alpha angle have been instrumental in delineating growth patterns across various crops, including soybeans and onions (Kim et al., 2011; Mascolo et al., 2015). However, there are varied findings on the effectiveness of the cross ratio (VH/VV). Blaes et al. (2006) highlighted a diminished sensitivity of the cross ratio to maize growth beyond certain LAI and VWC thresholds, and Hosseini et al. (2019) emphasized the significance of VH and VV backscatter over the cross ratio for accurate winter wheat LAI and Canopy Chlorophyll Content (CCC) estimation. Conversely, Mercier et al. (2020) demonstrated the cross ratio's correlation with wheat LAI and the usefulness of entropy for assessing wheat VWC. Moreover, the comprehensive analysis by Canisius et al. (2018), employing a wide array of SAR features from RADARSAT-2, including both VV and VH backscatter coefficients alongside several decomposition parameters, highlighted the significant role of VH cross-polarization and the Alpha angle from the Cloude-Pottier decomposition in tracking the growth stages of canola and spring wheat. Further, Mandal et al.
(2020) presented a strong correlation between DpRVI and essential BPs such as Plant Area Index (PAI), VWC, and dry biomass (DB) across various crops, notably outperforming other indices. The value of polarimetric features was further supported by Ge et al. (2023), who leveraged a combination of polarimetric decomposition techniques to discern characteristics crucial to rice phenology. Collectively, these studies affirm the strategic importance of integrating multiple SAR features, particularly polarimetric parameters and VH polarization, in enhancing the accuracy and reliability of crop phenology and BPs estimations.

1.2.2.3. Yield Prediction

Recent research has revealed the intricate relationship between SAR polarizations, frequencies, and their applications in crop yield predictions. The sensitivity of SAR backscatter to crop biomass and LAI, which directly correlate with crop yield, can be valuable for yield estimation. However, the relationship between SAR backscatter and crop biomass is not uniform and varies based on factors such as crop type, growth stage, and SAR sensor characteristics, including wavelength and polarization (Bouman and Hoekman, 1993). The structural differences among crops significantly influence how SAR signals interact with vegetation, which can affect the selection of SAR frequency, as discussed in Section 1.2.2.1. Optimizing the timing and frequency of SAR acquisitions based on crop growth stages is crucial for accurate yield prediction. Studies have shown that SAR data acquired during the reproductive and ripening stages are most effective for estimating rice yield (Nguyen et al., 2016), while for soybean, data from the pod development and seed filling stages are more informative (Navarro et al., 2016). In the case of corn, SAR data from the late vegetative and early reproductive stages, such as tasseling and silking, have demonstrated potential for yield estimation (Fieuzal et al., 2017).
SAR polarization is another critical factor affecting the sensitivity of backscatter to crop biomass and yield. Studies by Tesfaye et al. (2022), Tripathi et al. (2022), and Sharma et al. (2022) have demonstrated the superiority of VH polarization in predicting rice and wheat yield. Conversely, VV polarization has exhibited better performance for sugarcane stalk development, which serves as a critical reservoir for sucrose accumulation (den Besten et al., 2023). However, the synergistic use of both VV and VH polarizations has been shown to enhance the robustness and precision of yield estimation models (Yu et al., 2023). Besides linear polarization, the highest correlations with LAI and biomass have been found for volume scattering components from polarimetric parameters indicative of multiple scattering events, pedestal height, and RVI (Steele-Dunne et al., 2017). This suggests that incorporating these parameters into DL models for yield prediction could improve their performance. However, DL-based studies have so far used only VH and VV SAR observables as input features. This reliance is due to the inherent capability of DL models to automatically learn and extract relevant features from raw input data, such as VH and VV signals in SAR imagery. This capability reduces the need for manually crafted SAR indices like the RVI, which depends on the ratio of VH to VV signals. By directly processing raw SAR observables, DL models can uncover complex patterns and relationships that predefined indices may not capture, potentially leading to more precise and efficient analysis of agricultural scenes.

1.2.3. Data Sources

Out of the 82 papers reviewed, 11% (9 papers) utilized airborne SAR data, such as multi-temporal AirSAR, UAVSAR, and PolSAR images from the AgriSAR project (Figure 1). These airborne SAR systems offer high spatial resolution and flexibility in data acquisition, making them valuable for small-scale studies and algorithm development.
However, the majority of the papers (89%) employed spaceborne SAR data, with Sentinel-1 (C-band) being the most commonly used, in 80% (66 papers) of the cases. The widespread use of Sentinel-1 data can be attributed to its open access policy, systematic acquisition strategy, and global coverage. The Sentinel-1 mission, comprising two satellites (Sentinel-1A and Sentinel-1B, which was decommissioned on 23 December 2021), provides C-band SAR data with a revisit time of 6-12 days, making it an invaluable resource for agricultural monitoring applications. Given the widespread use of spaceborne SAR data in conjunction with DL for agriculture, Table 1.1 offers a comprehensive overview of all the spaceborne SAR satellites.

Table 1.1: Overview of SAR Spaceborne Satellites: Specifications, Operators, and Data Accessibility.

Name Mission ERS-1 JERS-1 ERS-2 RADARSAT-1 ENVISAT-ASAR RADARSAT-2 1991 to 2000 1992-1998 1995 to 2011 1995 to 2013 Spatial coverage global global global Regional * Spatial Resolution (meter) 10-30 18 10-30 10(FM), 25(SM), 10-50(SWM), 30(SNM) Temporal Resolution (days) 35 in IM 44 35 in IM 24 (FM), 12 (SM), 72 (SWM) Band/Frequency (GHz) C-5.2 L-1.275 C-5.3 C-5.3 2002 to 2012 global 30 35 in APM and IM modes C-5.6 2007 to now global 10(FM), 25(SM), 10-50(SWM), 30(SNM), 3(UFM) 24 (SM), 3 (FM) C-5.405 Incidence Angle (°) Polarization 5-45 32-38 5-45 20-35(FM), 20-45(SM), 10-60 (SWM), 30(SNM) 15-45(WSM, IM, APM), 17-43 (GMM), 23(WM), 20-44 (SNSM) 20-35(FM), 20-45(SM), 10-60 (SWM), 30(SNM), 3(UFM) VV HH VV HH HH, VV, HH/HV, VV/VH HH, VV, HV, VH global 1(Spotlight) 16 X-9.65 20-59 VV, HH COSMO-SkyMed TerraSAR-X/TanDEM-X v1-2(2007), v3(2008), v4 (2010) to now TerraSAR -2007 to now, TanDEM- 2010 to now RISAT-1 2012-2017 global 1-50 Gaofen-3 (GF-3) 2014 to now ALOS-2 2014 to now Sentinel-1A, C A-2014 to now, C-2024 Swath of 10-650 km global 10-100 global 3-10 global 1-3 (SL and SM), 16 (SC) 11 25 X-9.65 C-5.35 1-10 1.5-3 C-5.405 L-1.27 C-5.405 46 12 12
Sentinel-1 B 2016-2021 global 3-10 C-5.405 20-45 20-55 12-55 20-50 20-40 20-45 HH, VV, HH/VV, HH/HV, VV/VH HH, VV, HH/HV, VV/VH, Quad-pol HH, VV, HH/HV, VV/VH HH, HV, VH, VV VV-VH (IW), VV-VH, VV or HH (strip mode), VV(WM) VV-VH (IW), VV-VH, VV or HH (strip mode), VV(WM) Name Mission Spatial coverage NovaSAR-1 2018 to now global PAZ 2018 to now global Spatial Resolution (meter) 6 (SM), 20 (SC), 30-50 (SC wide) 1-3 (SL and SM), 16 (SC) SAOCOM-1A, B A-2018 and B-2020 to now RCM 2019 to now Capella2-10 Umbra 2020-2023 2023-now Iceye NISAR BIOMASS Sentinel-1 NG ROSE-L Tandem-L 2023 2024 2024 2029 2028 2028 global 10 (SM) global global global 1-3 (SL), 50-100 (SC) 0.5 0.25 Regional 0.5-3 global global global global global 3-10 50-200 1-5 <=5 1 for spot image Polarization HH or VV (SM), HH, VV and HV (SC and SC wide) VV, HH, HV, and VH HH, VV, HH/HV, VV/VH, Quad-pol HH, VV, HV, VH, Compact pol VV, VH VV, VV+VH HH, VV, HH/HV, VV/VH, Quad-pol HH, VV Quad-pol HH, VV, HH/HV, VV/VH, Quad-pol HH/HV, VV/VH, Quad-pol HH, VV, Quad-pol Temporal Resolution (days) Band/Frequency (GHz) S-3.1 – 3.3 Incidence Angle (°) 16-31.2 (SM), 11.29-32.01(SC), 11.82 – 31.18 (SC wide) X-9.65 L-1.215 20-55 18-50 C-5.405 33.63-35.93 16 11 16 4 <= 2 hour 6-12 hour X-9.3-9.9 X-9.2 – 10.4 1-22 12 365 4-12 3-6 16 X-9.65 L-(1.215-1.3) and S-3.2 P-0.435 C-5.405 L-1.2575 L-1.215–1.3 45-53 10-75 15-35 33-47 23-34 17-45 15-45 20-45

1.3. Use of Deep Learning in Agricultural Applications of SAR

DL and SAR technologies have been widely utilized in the agricultural sector, with numerous studies conducted in various countries. Europe, the USA, Brazil, and China have emerged as key regions with a significant number of research efforts in this field. This trend highlights the widespread adoption of DL applications in diverse ecosystems, showcasing their adaptability for different vegetation types and agricultural applications.
Upon a detailed review of the existing literature, it is clear that a substantial number of studies emphasize the use of SAR and DL techniques for classification/mapping applications, particularly targeting end-of-season crop mapping, most notably for rice (Figure 1.3 and Figure 1.4). Despite the potential for wider applications, contemporary DL techniques have seen surprisingly limited use for crop monitoring (5 studies) and crop yield estimation (3 studies). A probable factor is the relative scarcity of training datasets for these applications, which are critical for the optimal performance of DL algorithms. In this section, we examine the potential of DL algorithms to mitigate the speckle effect in SAR images. Furthermore, we investigate the fusion of SAR and optical data, as research has shown that combining these two data sources can yield superior results compared to using either SAR or optical data alone in various agricultural applications, such as crop classification, monitoring, and yield estimation. This underscores the potential benefits of integrating multiple remote sensing data sources within DL frameworks for agricultural purposes. Additionally, we explore feature selection methods, which enhance DL model efficiency and generalization by reducing data dimensionality and eliminating redundant features. We also examine both established and emerging DL architectures to gain a deeper understanding of their contributions to SAR applications in mapping/classification, including early- and end-season crop classification, crop rotation, mapping of center pivot irrigation systems, and soil salinity mapping, as well as crop monitoring and yield prediction. Moreover, we discuss crucial implementation considerations, such as data collection and augmentation techniques, along with training and validation ratios, which play a vital role in the successful deployment of DL models for agricultural applications.
Figure 1.3: Distribution of studies across SAR bands, agricultural applications, and DL architectures. Emphasis on C-band SAR for crop classification using LSTM and 2D-CNNs. CPIS: Mapping of Center Pivot Irrigation Systems.

Figure 1.4: Network analysis of interconnections between agricultural applications, SAR frequencies, and DL architectures in reviewed studies. Node size indicates number of studies. Att: self-attention mechanism; TL: transfer learning; CPIS: mapping of Center Pivot Irrigation Systems.

1.3.1. Data Processing Techniques

1.3.1.1. Speckle Filtering

The speckle effect is a common phenomenon in SAR imagery that degrades image quality and hinders the accurate interpretation of dynamic crop phenology. Traditional speckle filtering methods often struggle to achieve a balance between noise reduction and preservation of fine details. However, DL approaches, particularly Convolutional Neural Networks (CNNs), have shown remarkable success in addressing this challenge. CNNs can learn hierarchical features from SAR data, enabling them to effectively distinguish between noise and actual ground features. By training on large datasets of SAR images with varying levels of speckle, CNNs can learn to suppress speckle while retaining important spatial and textural information. This capability has led to significant improvements in SAR image quality, facilitating more accurate crop classification, monitoring, and yield estimation. Several studies in this review have explored techniques for speckle reduction in SAR data. Mei et al. (2018) employed Simple Linear Iterative Clustering (SLIC) superpixel segmentation to reduce the speckle effect by dividing the SAR image into smaller superpixel blocks. Furthermore, Adrian et al. (2021) introduced Denoising Convolutional Neural Networks (DnCNNs), which adapt to the specific noise characteristics within SAR images, preserving essential details while eliminating noise.
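For context on the trade-off that traditional filters face, the sketch below implements a classical Lee-type adaptive filter. This particular filter is not named in the studies reviewed here and is used only as a familiar baseline; the function name, the 7 x 7 window, the single-look assumption, and the synthetic exponential speckle are all illustrative choices. The filter replaces each pixel by a blend of the local mean and the observed value, with the blend weight driven by how much the local variation exceeds what speckle alone would produce.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lee_filter(intensity, win=7, looks=1.0):
    """Lee-type adaptive speckle filter; win must be odd, output is cropped."""
    img = np.asarray(intensity, dtype=float)
    windows = sliding_window_view(img, (win, win))
    mean = windows.mean(axis=(-2, -1))
    var = windows.var(axis=(-2, -1))
    half = win // 2
    center = img[half:img.shape[0] - half, half:img.shape[1] - half]
    cu2 = 1.0 / looks                           # squared speckle variation coefficient
    ci2 = var / np.maximum(mean ** 2, 1e-12)    # squared local variation coefficient
    # Weight near 0 in homogeneous areas (output ~ local mean),
    # near 1 where local contrast far exceeds the speckle level (pixel kept).
    k = np.clip(1.0 - cu2 / np.maximum(ci2, 1e-12), 0.0, 1.0)
    return mean + k * (center - mean)

rng = np.random.default_rng(1)
speckled = rng.exponential(1.0, size=(64, 64))  # single-look speckle over a flat field
filtered = lee_filter(speckled, win=7, looks=1.0)
cropped = speckled[3:-3, 3:-3]                  # same footprint as the filter output
```

Over a homogeneous field the weight stays close to zero, so the output is essentially a local average with reduced variance. The same averaging is what blurs field boundaries and fine detail, which is the limitation that the learned approaches above aim to overcome.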
Interestingly, some studies have highlighted the potential of DL algorithms to obviate the need for preprocessing SAR data. Garnot et al. (2022) demonstrated that the U-Net architecture, combined with temporal attention-based networks (discussed in Section 1.3.2.1), could effectively learn features and patterns from vast raw datasets without requiring radiometric terrain correction or speckle filtering. This approach enhanced crop mapping models without the need for extensive SAR preprocessing. Similarly, Gargiulo et al. (2020) showed that the W-net architecture (discussed in Section 1.3.2.1) could produce reliable segmentation maps without speckle filtering, reducing computational complexity. These findings underscore the potential of DL methods not only to mitigate the speckle effect in SAR data but also to eliminate the need for certain preprocessing steps. Further research into the capabilities of DL in handling raw SAR data could lead to more efficient and streamlined workflows for agricultural applications.

1.3.1.2. SAR and Optical Fusion Techniques

Integrating SAR with optical data represents a pivotal advancement in agricultural monitoring, optimizing the strengths and mitigating the limitations of each sensor type (Ofori-Ampofo et al., 2021). This fusion approach confronts challenges such as integrating multi-band spectral reflectance with SAR backscatter intensity and overcoming differences in spatial resolutions and temporal characteristics. Evidence of the successful application of SAR and optical data fusion is abundant, with notable examples including the combination of Sentinel-1 with Landsat-8 (Kussul et al., 2018, 2017; Cué La Rosa et al., 2023) and with Sentinel-2 datasets (Asadi and Shamsoddini, 2024; Komisarenko et al., 2022; Liu et al., 2021; Ngo et al., 2023; Saadat et al., 2022; C. Sun et al., 2019; Thorp and Drajat, 2021; Tripathi et al., 2022; Wang et al., 2020; Yu et al., 2023; Zhao et al., 2020, 2022).
A comprehensive review of 82 studies in this field revealed that 45% (37 papers) utilized a combination of optical and SAR data, primarily for crop classification, and that this group includes all studies on yield and BPs estimation. This review also highlighted the wide array of optical data sources employed beyond Sentinel-2 and Landsat-8, such as VENµS, RapidEye, ZY-3, Planet satellite imagery, Very High Resolution orthophotos, AVIRIS, ROSIS, and RGB data from the CNES/Airbus Pleiades satellite. The superiority of combining SAR and optical data over approaches relying on a single modality has been consistently validated across research, particularly for crop classification (de Albuquerque et al., 2021; Giordano et al., 2020; Ofori-Ampofo et al., 2021), crop monitoring (Lobert et al., 2023; Thorp and Drajat, 2021), and yield prediction (Yu et al., 2023). This fusion technique benefits from combining SAR's detailed textural insights with the spectral richness of optical imagery, offering a broader spectrum of data for analysis. Data fusion strategies can be broadly categorized into three approaches: early fusion (input/pixel level), mid fusion (feature/layer level), and late fusion (decision level). Early fusion combines SAR and optical data at the input level, using specific methods such as interpolation to address gaps in optical images. Mid fusion merges features from each source at an intermediate stage, facilitating the use of a single temporal model and reducing preprocessing efforts. Late fusion, on the other hand, combines the outputs from independently processed modalities, relying on class confidence scores for final decision-making. For a deeper understanding of various fusion methods, readers are encouraged to read the papers by Garnot et al. (2022), Ofori-Ampofo et al. (2021), and Weilandt et al. (2023).
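The three fusion levels described above can be sketched with plain array operations. The example below is only a schematic under stated assumptions: the function names, array shapes, and logits are hypothetical, the inputs are assumed to be co-registered on a common grid with optical gaps already filled, and the encoders and classifiers that would produce the features and logits in a real model are omitted.

```python
import numpy as np

def softmax(logits):
    """Convert raw class scores to probabilities along the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def early_fusion(sar, optical):
    """Input-level fusion: stack the bands of both sensors before any model."""
    return np.concatenate([sar, optical], axis=-1)

def mid_fusion(sar_features, optical_features):
    """Feature-level fusion: join embeddings produced by separate encoders."""
    return np.concatenate([sar_features, optical_features], axis=-1)

def late_fusion(sar_logits, optical_logits, sar_weight=0.5):
    """Decision-level fusion: weighted average of class confidence scores."""
    probs = (sar_weight * softmax(sar_logits)
             + (1.0 - sar_weight) * softmax(optical_logits))
    return probs.argmax(axis=-1), probs

# One pixel, 4 dates: 2 SAR bands (VV, VH) and 3 optical bands per date.
sar = np.zeros((1, 4, 2))
optical = np.ones((1, 4, 3))
stacked = early_fusion(sar, optical)

# Hypothetical per-modality classifier outputs for three crop classes.
labels, probs = late_fusion(np.array([[2.0, 0.0, 0.0]]),
                            np.array([[0.0, 0.0, 3.0]]))
```

Early and mid fusion use the same concatenation and differ only in where it is applied, on raw inputs or on learned features. In the late-fusion example the two modalities disagree, and the averaged confidences favor the class on which the optical branch is more certain.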
The choice between these fusion strategies largely depends on the desired outcomes, the specific characteristics of the datasets involved, and computational constraints. While most studies have shown a preference for early fusion due to its straightforward implementation, other studies compared all three fusion methods to find the best one. Ofori-Ampofo et al. (2021) demonstrated the effectiveness of early fusion, especially under cloudy conditions, and proposed layer-level fusion at the Pixel Set Encoder (PSE) and Temporal Attention Encoder (TAE) (detailed in Section 1.3.2.1) for better identification of minor classes. Garnot et al. (2022) explored late fusion with the same PSE-TAE classifier, augmented with auxiliary supervision and temporal dropout, finding it generally superior but noting that mid-fusion offers a pragmatic balance between accuracy and computational efficiency, being 20% faster than late fusion. This makes mid-fusion appealing for scenarios with computational constraints. Yuan et al. (2023) tackled the high computational costs associated with late fusion by introducing a shared temporal encoder and a 'feature stacking' technique to the PSE-TAE classifier. This method consolidates temporal variation metrics from separately processed optical and SAR data, achieving a 60% reduction in trainable parameters without compromising performance. This innovation retains the effectiveness of the DL classifiers used by Ofori-Ampofo et al. (2021) and Garnot et al. (2022), streamlining late fusion processes in RS applications. Ienco et al. (2019) also demonstrated that late fusion, involving features extracted separately from SAR and optical streams using two ConvGRU networks, outperformed other fusion techniques. In contrast, Saadat et al. (2022) confirmed the advantages of mid-fusion in rice mapping using CNNs.
Consequently, the choice between mid- and late fusion becomes a strategic decision, influenced by the specific application and the DL classifier utilized, for optimal SAR and optical data integration.

1.3.1.3. Feature Selection

Although DL models possess the ability to learn features automatically, providing them with a carefully selected subset of relevant features can help reduce the computational complexity and training time of the model. This is particularly important when dealing with large, high-dimensional datasets, such as multi-temporal SAR data. Consequently, to enhance classification accuracy, it is essential to minimize feature redundancy and avoid overfitting, a challenge noted by Zhang et al. (2020) yet addressed in only a limited number of studies concerning optimal SAR feature selection. Additionally, Zhang et al. (2021) observed that even with similar crop types, the most effective features for crop identification vary across different regions. To address this, Zhang et al. (2020) advocated for a tree structure-based feature selection algorithm that prioritizes features based on their calculated significance. Mei et al. (2018) improved classification accuracy by optimizing the feature set through a quantitative index that evaluates the separability of crop types. Zeyada et al. (2016) enhanced classification accuracy by identifying the superior performance of polarimetric parameters from Pauli, Cloude–Pottier, and Freeman–Durden decompositions, in conjunction with fundamental backscatter coefficients. They determined that expanding the parameter set from three to twelve could minimize training errors and prevent overfitting. Similarly, Yu et al.
(2023) and Lobert et al. (2023) evaluated the performance of different sets of features to select the optimal features for yield prediction and phenology stage estimation, respectively. The results indicated that the combination of SAR with optical and meteorological data was the most effective in both studies. Delving further into SAR features, Hashemi et al. (under review) recently investigated the impact of different SAR observable combinations on crop yield estimation. Their findings revealed that the fusion of VH polarization with climate data outperformed other feature sets, which included VV polarization, cross ratio, RVI, and incidence angle, for corn, soybean, and winter wheat yield estimation. The aforementioned studies collectively emphasize the crucial role of feature selection in enhancing the performance and efficiency of DL models when applied to SAR data in agricultural applications. By carefully choosing the most informative SAR observables and ancillary data prior to training DL models, researchers can optimize the models' ability to achieve reliable results in tasks such as crop classification, yield prediction, and phenology stage estimation.

1.3.2. Common and Emerging Deep Learning Modeling

1.3.2.1. Classification/Mapping

a. End-Season Crop Classification

The integration of SAR imagery with ML approaches has revolutionized the field of agricultural technology, particularly in the realm of precise crop classification at the pixel and parcel level. Traditionally, these methods relied on stacking time-series SAR imagery as a composite of features and employing data mining techniques to differentiate various crop types (Han et al., 2023). This reliance on handcrafted features, however, has necessitated expert knowledge and often overlooked the nuanced spatio-temporal relationships inherent in time-series SAR data.
Techniques such as Random Forest (RF) have shown proficiency in identifying predominant crop types but struggle when distinguishing less prevalent ones, which has been attributed to their tendency to overfit as the decision trees expand, particularly in multi-class scenarios (Jin et al., 2018). Moreover, traditional ML models face inherent limitations, such as their inability to effectively process sequences or time-dependent data due to their auto-regressive nature, which hinders their ability to generalize and adapt to new data (Katharopoulos et al., 2020). In contrast, DL models have emerged as a superior alternative, showcasing their capacity for multiscale feature learning and demonstrating a remarkable ability to generalize across diverse datasets (Olimov et al., 2023). The launch of the C-band Sentinel-1A and Sentinel-1B satellites in 2014 and 2016, respectively, has significantly accelerated the adoption of DL techniques alongside SAR data by providing widespread, free access to high-quality SAR data. This development has enabled more comprehensive studies, specifically in crop classification, that harness the power of SAR data and DL techniques.

The investigation commenced with the exploration of Multilayer Perceptrons (MLP), a class of artificial neural networks (ANNs) in which each neuron in one layer is connected to every neuron in the next layer. An MLP generally consists of two or more layers that can separate nonlinear data (Mas and Flores, 2008). Sonobe et al. (2017) and Skakun et al. (2015) illustrated the efficacy of MLP for crop classification, achieving an OA exceeding 90%. Conversely, Zeyada et al. (2016) highlighted that shallow ML methods, when applied to C-band SAR data, outperformed MLP in crop classification tasks. This insight underscored the limitations of MLP, particularly its less optimal performance in handling complex spatial and channel information inherent in SAR imagery.
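As a minimal illustration of the fully connected structure described above, the following sketch runs a forward pass through a two-layer MLP in which every input feeds every hidden neuron. The weights, the two-feature input, and the two output classes are hypothetical values chosen for illustration. A real model would learn its weights from labeled samples.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: every input contributes to every output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise nonlinearity that lets the network separate nonlinear data."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into class probabilities that sum to one."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(x, w1, b1, w2, b2):
    """Input -> hidden layer with ReLU -> output layer with softmax."""
    hidden = relu(dense(x, w1, b1))
    return softmax(dense(hidden, w2, b2))


# Hypothetical weights: 2 inputs (e.g., scaled VV and VH) -> 3 hidden -> 2 classes.
w1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b1 = [0.0, 0.1, -0.1]
w2 = [[1.0, -1.0, 0.5], [-0.5, 0.7, 0.2]]
b2 = [0.0, 0.0]
probs = mlp_forward([0.3, 0.6], w1, b1, w2, b2)
print(probs)
```

Because the input is a flat vector, the network has no notion of which features are spatial neighbors or consecutive dates, which is consistent with the limitation noted above.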
Consequently, the year 2017 witnessed a strategic shift towards Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to better capture the spatial and temporal dynamics of multi-temporal SAR imagery. CNNs, in particular, are known for their hierarchical structure, which enables high-accuracy classification and prediction by learning spatial contextual representations through convolutional filters and realizing end-to-end classification in large-scale SAR imagery (Oquab et al., 2014). Carranza-García et al. (2019) highlighted the ability of CNNs to excel at processing minority classes, distinguishing them from other ML methods. Notably, CNNs can be utilized in 1D for analyzing the channel or temporal dimension, in 2D for the spatial dimension, or in 3D across the channel, spatial, and temporal dimensions. Although 3D convolution has demonstrated high accuracy in crop classification (Kussul et al., 2017), CNNs are rarely used as feature extractors for the temporal domain of remotely sensed time-series (Zhong et al., 2019b). However, highlighting the effectiveness of CNNs in temporal analysis, Asadi & Shamsoddini (2024) demonstrated the superiority of 1D-CNNs over shallow ML methods by using the backscatter and polarimetric features from SAR time-series for crop mapping. Further supporting the advancements in CNN applications, Teimouri et al. (2022) corroborated earlier findings, demonstrating the superior performance of 3D-CNNs over 2D-CNNs and MLP. Their study underscored the importance of fine-tuning the kernel depth in 3D-CNNs, which facilitated the successful learning of crop growth cycles and consequently boosted the classification accuracy. To address the "curse of dimensionality" issue plaguing CNNs in handling high-dimensional SAR data, a stacked auto-encoder (SAE) was combined with 1D-CNNs to design a convolutional-autoencoder neural network (C-AENN) by Luo et al.
(2022) and Guo et al. (2022). This model takes advantage of the dimension reduction capabilities of the SAE (Guo et al., 2020; Hinton and Salakhutdinov, 2006) and achieves a classification ability that surpasses standalone 1D-CNNs and SAE approaches, as well as traditional ML methods. Autoencoders are primarily linked with unsupervised learning, as they are designed to compress input data into a condensed representation and subsequently reconstruct it without requiring labeled data during the training process. Di Martino et al. (2022) successfully employed C-AENN to classify crops in an unsupervised manner, proving its effectiveness in extracting detailed agricultural classes from temporal SAR signatures. Furthermore, when incorporated into supervised learning frameworks, autoencoders can enhance classifier performance by providing a richer and more relevant data representation (Goodfellow et al., 2016). Di Martino et al. (2023) also demonstrated the utility of C-AENN in a semi-supervised context to identify and rectify labeling errors in crop type datasets.

Unlike conventional CNNs, the fully convolutional network (FCN) avoids fully connected layers in favor of convolutional and pooling operations that facilitate pixel-level crop classification by learning spatial relationships and generating predictions for each individual pixel in the input image (Long et al., 2015). While Cué La Rosa et al. (2018) and Mullissa et al. (2018) highlighted the superiority of FCN over patch-based 2D-CNNs, Cué La Rosa et al. (2019) confirmed the superior performance of 3D-CNNs compared to 3D-FCN and traditional ML methods. Classifying pixels independently using FCNs, while considering their spatial pattern, is particularly effective when dealing with high-resolution SAR data where pixel-level classification is critical. In contrast, patch-based 2D-CNNs, which classify patches of pixels, can result in a loss of spatial detail and potentially lower classification accuracy.
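To make the temporal use of 1D convolution discussed above more concrete, the sketch below slides a single kernel along a VH backscatter time series. The series values and the hand-set difference kernel are hypothetical. In a 1D-CNN the kernel weights would be learned during training, and many kernels would be applied in parallel. As in most DL frameworks, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def conv1d_valid(series, kernel):
    """Slide a kernel along a time series with stride 1 and no padding."""
    k = len(kernel)
    return [sum(kernel[j] * series[i + j] for j in range(k))
            for i in range(len(series) - k + 1)]


# Hypothetical VH backscatter (dB) over six acquisition dates.
vh_db = [-20.0, -19.0, -16.0, -14.0, -14.5, -17.0]

# Hand-set kernel that responds to increases over a three-date window.
diff_kernel = [-1.0, 0.0, 1.0]

response = conv1d_valid(vh_db, diff_kernel)
print(response)  # [4.0, 5.0, 1.5, -3.0]
```

Positive responses mark windows where backscatter rises, as during canopy development, and the negative response marks the decline at the end of the series. A learned bank of such kernels is how a 1D-CNN can pick up growth-cycle patterns without handcrafted temporal features.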
However, for 3D data, 3D-CNNs, which are capable of learning hierarchical features across both temporal and spatial dimensions, appear to perform better than 3D-FCNs, despite the latter's ability to model spatio-temporal information. This could be attributed to the ability of 3D-CNNs to capture complex temporal patterns and dependencies in time-series SAR data, which is crucial for distinguishing crops with similar backscatter characteristics but different temporal behaviors.

The U-Net architecture, a variant of CNNs, has revolutionized the field of semantic segmentation by providing an innovative approach to preserving spatial integrity, which is crucial for accurate crop classification (Ronneberger et al., 2015). U-Net's unique structure allows it to effectively capture and integrate spatial information and contextual features across different scales. During the contracting/encoding phase, U-Net reduces the spatial dimensions while increasing the number of feature channels. In the expansive/decoding phase, it combines the feature information with the spatial information from the contracting path through skip connections, enabling precise localization of classified pixels (Wenger et al., 2022). This architecture also demonstrates robustness in handling imbalanced datasets, a common challenge in crop classification tasks (L. Ma et al., 2019). Recent advancements in U-Net have further enhanced its performance in crop mapping. Adrian et al. (2021) introduced a 3D U-Net method that learns local spatial and temporal features simultaneously by applying 3D convolution kernels throughout the crop growing season. This approach outperformed 2D U-Net, SegNet (Badrinarayanan et al., 2017), and Random Forest (RF) models in terms of overall crop mapping accuracy. Following the validation of U-Net's effectiveness in rice mapping by Wei et al. (2021), Xu et al.
(2021) integrated a Conditional Random Field (CRF) with U-Net, significantly improving the accuracy of field boundaries and plot compactness on a large scale.

Attention mechanisms have also been incorporated into U-Net to boost its performance. Ma et al. (2022) proposed an attention-gated U-Net architecture (Oktay et al., 2018) that outperformed DeepLab v3 (Chen et al., 2017) and traditional ML methods in rice mapping accuracy. Furthermore, M. Wang et al. (2022) augmented the U-Net model with a SegNet backbone and incorporated Object-Based Image Analysis (OBIA), resulting in high-resolution rice field mapping that surpassed the performance of a U-Net model based on a Residual Network (ResNet) (Szegedy et al., 2017) backbone. However, in a recent comparative study by Ngo et al. (2023), U-Net with a ResNet backbone surpassed DeepLab-V3+, which utilized the Xception network (Chollet, 2017), by 1-3% in accuracy. Ngo et al. (2023) also evaluated the performance of two widely used ML methods in crop classification, XGBoost (Chen and Guestrin, 2016) and LightGBM (Ke et al., 2017), which employ gradient boosting algorithms. Interestingly, both ML methods achieved an accuracy of 92%, matching the performance of LinkNet (Chaurasia & Culurciello, 2017), a CNN-based model that utilizes ResNet for feature extraction. However, U-Net demonstrated superior performance, surpassing XGBoost, LightGBM, and LinkNet in rice mapping accuracy using SAR imagery. This supremacy of U-Net was challenged by Gargiulo et al. (2020), who demonstrated a 3% improvement in crop classification using W-Net over U-Net, LinkNet, Feature Pyramid Network (FPN) (T.-Y. Lin et al., 2017), and SegNet (Badrinarayanan et al., 2017) using Sentinel-1. Moreover, W-Net offers advantages in processing time, memory usage, and performance while mitigating the multiplicative speckle effect more efficiently and maintaining fewer parameters despite its additional convolution layers.
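The encoder-decoder structure with skip connections described for U-Net above can be sketched on a toy single-channel image. This is a deliberately simplified illustration: the learned convolutions of a real U-Net are omitted, nearest-neighbor upsampling stands in for the learned upsampling step, and the function names and the 4x4 input are hypothetical.

```python
def max_pool2x2(fmap):
    """Encoder step: halve height and width by keeping the max of each 2x2 block.

    Assumes even height and width.
    """
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]), 2)]
            for r in range(0, len(fmap), 2)]


def upsample2x(fmap):
    """Decoder step: nearest-neighbor upsampling that repeats each value 2x2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def skip_concat(decoder_map, encoder_map):
    """Skip connection: pair decoder and encoder values as two channels per pixel."""
    return [[(d, e) for d, e in zip(drow, erow)]
            for drow, erow in zip(decoder_map, encoder_map)]


image = [[1, 2, 0, 1],
         [3, 4, 1, 0],
         [0, 1, 5, 6],
         [2, 0, 7, 8]]

encoded = max_pool2x2(image)          # 2x2 coarse, context-level map
decoded = upsample2x(encoded)         # back to 4x4, but fine detail is lost
merged = skip_concat(decoded, image)  # fine detail restored next to the context

print(encoded)       # [[4, 1], [2, 8]]
print(merged[0][0])  # (4, 1)
```

After the skip connection, each pixel carries both the coarse context value and its original fine-resolution value, which is what allows the decoder to localize class boundaries precisely.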
The exploration of DL models in crop classification using SAR imagery continues to evolve, with RNNs marking a significant advancement in processing sequential data. Specialized for analyzing multitemporal SAR data, RNNs, particularly Long Short-Term Memory (LSTM) networks and their bidirectional counterparts (Bi-LSTM), have been widely adopted for crop mapping. These models have shown remarkable accuracy in capturing temporal correlations and extracting multi-temporal features from time-series SAR data, significantly outperforming traditional ML approaches in mapping rice crops (Crisóstomo de Castro Filho et al., 2020; Wang et al., 2020). The inherent gating mechanism of LSTM allows for selective information retention or discarding within the hidden state layer, effectively addressing long-term dependencies and the challenge of vanishing gradients in input sequences, a common obstacle in traditional RNN training (Graves and Graves, 2012). Building on this foundation, Y. Zhou et al. (2019) leveraged LSTM with GLCM feature extraction to achieve a 5% boost in OA over traditional ML methods. Despite these advancements, Qu et al. (2020) highlighted the superiority of 1D-CNNs over LSTM in crop classification. Furthermore, Lin et al. (2022) advanced the application of LSTM through Multi-Task Learning for extensive rice mapping, leveraging time-series SAR data from Sentinel-1. Subsequently, researchers began to use a simpler and computationally more efficient variant of RNNs, the Gated Recurrent Unit (GRU), which outperformed LSTM in crop classification (Ndikumana et al., 2018; Ni et al., 2022). GRU can effectively model sequential data with fewer gates and parameters compared to LSTM (Cho et al., 2014). Ni et al. (2022) demonstrated that GRU outperformed LSTM and 1D-CNNs in processing temporal data, as well as FCN in modeling spatial polarimetric data.
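The gating mechanism of LSTM described above can be written out for a single time step. The sketch below uses the standard LSTM update equations with a scalar input and a scalar hidden state to keep it readable. The parameter values and the input series are hypothetical, and practical models use vector-valued states with learned weight matrices.

```python
import math


def sigmoid(z):
    """Squash a value into (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step for a scalar input and scalar hidden/cell state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g       # keep part of the old memory, add new content
    h = o * math.tanh(c)         # expose a gated view of the memory
    return h, c


# Hypothetical parameters; a trained network would learn these values.
params = {
    "wf": 0.5, "uf": 0.1, "bf": 1.0,
    "wi": 0.6, "ui": 0.2, "bi": 0.0,
    "wo": 0.4, "uo": 0.3, "bo": 0.0,
    "wg": 0.9, "ug": 0.5, "bg": 0.0,
}

# Hypothetical scaled backscatter values for five acquisition dates.
series = [0.1, 0.3, 0.8, 0.7, 0.2]

h, c = 0.0, 0.0
states = []
for x in series:
    h, c = lstm_step(x, h, c, params)
    states.append(h)
print(states)
```

When the forget gate is close to one and the input gate is close to zero, the cell state passes through a step almost unchanged. This additive path is what allows information from early dates to survive to the end of a long sequence. A GRU achieves a similar effect with fewer gates by merging the cell and hidden states.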
The bidirectional forms of LSTM and GRU process data in both directions, forward and backward, allowing them to incorporate information from both earlier and later data points in the sequence when making a prediction. In contrast, unidirectional RNNs process sequential data in a single direction, typically from the beginning of the sequence to the end. Accordingly, Bi-LSTM and Bi-GRU surpassed LSTM and GRU in crop classification using SAR imagery (Crisóstomo de Castro Filho et al., 2020; Ge et al., 2023; Sun et al., 2022). Additionally, Sun et al. (2022) highlighted the effectiveness of dual-branch BiLSTM (DB-BiLSTM) networks, which demonstrated marked superiority in large-scale rice mapping compared to conventional BiLSTM and RF methods. Ultimately, in a comparative analysis between CNNs and RNNs, Ni et al. (2022) demonstrated the superiority of FCN and 2D-CNNs over RNNs, with GRU outperforming LSTM slightly, and 2D-CNNs excelling beyond FCN. Therefore, researchers started to use hybrid architectures of CNN and RNN variants, e.g., ConvGRU and ConvLSTM, addressing an inherent limitation of LSTM and GRU: the loss of spatial context information when handling images (Shi et al., 2015). Martinez et al. (2021) and Rogozinski et al. (2022) improved the accuracy of these models by adding an Atrous Spatial Pyramid Pooling (ASPP) module to ConvLSTM. Moreover, the integration of the Bi-Tempered Logistic Loss (BiTLL) function during the ConvLSTM model's training phase enhances its robustness to noise in the training data: temperature settings modulate the logistic loss, making the model less sensitive to outliers or incorrect labels. This model achieved high accuracy in rice mapping using Sentinel-1 imagery and surpassed methods like GRU, 3D-CNNs, and basic ConvLSTM (Chang et al., 2022). Additionally, Rustowicz et al. (2019) demonstrated that a combination of 2D U-Net and ConvLSTM can surpass the performance of 3D U-Net.
In this approach, SAR images are initially processed through a U-Net, which enriches the data with detailed spatial context before it is fed into the ConvLSTM. This preprocessing step enhances and structures the spatial features, emphasizing the most relevant information for the task. As a result, the ConvLSTM can more effectively detect and interpret temporal changes, leveraging the improved spatial detail provided by the U-Net.

Since the introduction of the attention mechanism in DL in 2014, its incorporation into various models has markedly enhanced their effectiveness. By enabling models to selectively concentrate on the most pertinent aspects of the input data for a specific task, this dynamic allocation of input features facilitates more efficient learning of complex patterns and relationships (Bahdanau et al., 2014). Such improvements have been particularly beneficial in advancing the accuracy of crop classification. Notably, the integration of the attention mechanism into LSTM networks has significantly boosted temporal modeling capabilities, leading to state-of-the-art classification performance (Rußwurm & Körner, 2018). Chang et al. (2022) investigated the integration of attention mechanisms into ConvLSTM blocks to focus on the parts of Sentinel-1 SAR images most relevant for rice field detection. Furthermore, Garnot et al. (2022), Ofori-Ampofo et al. (2021), and Weilandt et al. (2023) proposed multi-modal crop mapping frameworks utilizing dense time-series of optical and radar data, combining a pixel-set encoder and temporal self-attention (PSE-TAE) to achieve multi-source feature fusion and improve crop mapping accuracy. Similarly, the U-TAE architecture introduced by Garnot & Landrieu (2021) incorporated the spatial U-Net-based architecture into TAE to achieve pixel-level crop mapping from a semantic segmentation perspective. Further, Z. Han et al.
(2023) developed a spatio-temporal Multi-level Attention (STMA) model for crop mapping that surpassed conventional convolutional models for accurate and generalized crop mapping across various datasets. While 3D U-Net excelled in handling spatial and temporal information with its 3D convolutional kernels, outperforming ConvLSTM, U-TAE, LSTM, and RF, it was surpassed by the STMA. The STMA method combines a multi-level attention mechanism, consisting of cascaded spatio-temporal self-attention (STSA) and multi-scale cross-attention (MCA) modules for effective spatio-temporal data processing, especially in noisy datasets, with a novel learnable spatial attention position encoding, and it demonstrated superior performance in capturing the complex dynamics of crop phenology.

The transformative impact of attention was further amplified with the introduction of the Transformer model in 2017 (Vaswani et al., 2017). The Transformer's novel "self-attention" mechanism marked a significant advancement in DL, particularly beneficial for tasks like crop classification. Unlike previous attention mechanisms that relied on sequential processing, the self-attention mechanism computes attention weights by comparing each element of the input sequence with every other element, allowing the model to determine the relative importance of each data point in the context of the entire sequence. This enables the Transformer to capture long-range dependencies more effectively. Moreover, the Transformer relies solely on the self-attention mechanism, dispensing with recurrent and convolutional layers, which allows for parallel processing of input data and makes it more computationally efficient. These features make the Transformer particularly suited for analyzing temporal patterns in crop classification from time-series SAR imagery. However, the Transformer's dependency on large datasets for training poses challenges for its application in scenarios with limited data availability.
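The comparison of each sequence element with every other element, as described above, corresponds to scaled dot-product self-attention. The following minimal sketch makes one simplifying assumption that should be noted: queries, keys, and values are all set equal to the input itself, whereas the Transformer applies learned linear projections to obtain them and uses several attention heads. The four-date, two-feature sequence is hypothetical.

```python
import math


def softmax(values):
    """Normalize scores into weights that sum to one."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned projections)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Compare this time step with every time step in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all time steps.
        outputs.append([sum(w[t] * x[t][j] for t in range(len(x)))
                        for j in range(d)])
    return outputs, weights


# Hypothetical sequence: four acquisition dates, two scaled features per date.
seq = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.8], [0.3, 0.3]]
out, attn = self_attention(seq)
print(attn[0])  # how strongly date 0 attends to each of the four dates
```

Each row of `attn` is computed independently of the others, so all time steps can be processed in parallel. This is the property that distinguishes self-attention from the step-by-step recurrence of LSTM and GRU.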
Building upon the success of the Transformer model in natural language processing tasks, researchers have adapted this architecture for visual tasks, giving rise to the Vision Transformer (ViT). ViT interprets images as sequences of patches, offering a more natural approach to handling multiple channels than traditional CNNs. Li et al. (2022) demonstrated ViT's superior performance in crop classification compared to various models, including 2D-CNNs with Attention (CNN-Att), 2D-CNNs-LSTM (C-LSTM), and LSTM with Attention (LSTM-Att). To further enhance ViT's effectiveness in crop classification tasks, Li et al. (2022) introduced a hybrid ViT (H-ViT) that incorporates a temporal dimension. This strategy harnesses the spatial feature extraction capabilities of CNNs in the preliminary layers, coupled with the global attention mechanism of Transformers in subsequent stages, enabling comprehensive spatio-temporal analysis. Moreover, Li et al. (2022) proposed a multi-branch architecture that utilizes self-supervised contrastive learning with ViT to process and integrate data from various sources, outperforming H-ViT in crop classification.

In conclusion, the incorporation of spatio-temporal DL methods has revolutionized crop classification using multi-temporal SAR images, leading to significant improvements in accuracy. Models such as 3D-CNNs and 3D-UNet have showcased their ability to effectively capture both spatial and temporal dependencies, surpassing the performance of traditional 2D approaches. Hybrid CNN-RNN architectures such as ConvLSTM, and the integration of 2D U-Net with ConvLSTM, have proven to further boost performance, enabling the models to focus on the most relevant features for the classification task at hand. The introduction of ViT, especially when combined with self-supervised learning techniques, has demonstrated promising results in crop classification by efficiently integrating multi-source data.
Moreover, the STMA model, with its innovative attention mechanisms, has excelled in capturing crop phenology dynamics and achieving accurate, generalized crop mapping across various datasets. Despite the demonstrated superiority of these advanced spatio-temporal models, our review reveals that they have been employed in a relatively small number of studies compared to more traditional DL architectures. As illustrated in Figure 1.3, LSTM, 2D-CNNs, 1D-CNNs, and 2D-UNet have been utilized in 23, 22, 14, and 11 papers, respectively, while ConvLSTM and self-attention mechanisms have been applied in 9 papers each. Transformers and 3D-UNet have been used in 6 and 4 papers, respectively, and the ViT has been explored in only one paper. This disparity in the application of these cutting-edge models highlights the need for future research to further investigate and leverage the potential of spatio-temporal methods for crop classification using SAR imagery.

b. Early-Season Crop Classification

Although a significant volume of research, totaling 60 studies, has thoroughly investigated the use of multiple SAR images for classifying crops at the end of their growing season, the potential for identifying crops during the early stages of growth has received comparatively less attention. Platforms such as Sentinel-1, known for their high temporal resolution (ranging from 6 to 12 days) and minimal delay in data acquisition, present promising opportunities for assessments at the initial stages of crop development. However, the task of identifying crops early in the season presents distinct challenges. The similar appearance of different crops in their early growth phases complicates the task of distinguishing between them. Moreover, variables like surface roughness and SM can influence the backscatter signals in early-season imagery, particularly when the crops have not reached full maturity, thus affecting the precision of early-season mapping across various crop types.
As a result, the efficacy of early-season classification varies from one crop to another, highlighting the need for tailored approaches in this application. For instance, Kussul et al. (2018) demonstrated that winter rapeseed, along with spring and summer crops, could be distinguished with high accuracy (>85%) at least 2 months before harvest. In contrast, crops like winter barley and grassland could not be reliably discriminated before harvest. Despite these challenges, SAR imaging offers a more viable option for early-season analysis compared to optical imagery. This advantage is particularly evident in regions prone to cloud coverage during this period, which can significantly compromise the quality and usability of optical data. According to Kussul et al. (2018), employing SAR imagery over optical can improve the precision of early-season crop identification by as much as 5%. Additionally, studies by Weilandt et al. (2023) and Ofori-Ampofo et al. (2021) confirmed that the fusion of SAR and optical datasets surpasses the performance of using either type of data individually for early-season crop classification. Further research by Zhao et al. (2019) revealed that a 1D-CNN model surpassed LSTM and GRU in early classification, while GRU displayed high accuracy earlier than other classifiers for end-season classification. In contrast, Rußwurm et al. (2023) enhanced the capability of LSTM models for early-season classification by adding a decision head to evaluate prediction uncertainty. This approach enabled precise classifications of Barley, Wheat, Rapeseed, Orchards, and Corn up to 3 months before harvest, leveraging combined data from Sentinel-1, -2, and Planet satellites for superior accuracy. Further, 2D-CNNs were examined with a combination of VH and VV SAR backscatter for early-season classification, enabling the classification of Soybean, Fallow, Cotton, Jowar, and Sugarcane 45 days before harvest, albeit with a 3.5% reduction in accuracy (Paul et al., 2022).
Nonetheless, Fontanelli et al. (2022) showcased the superiority of 3D-CNNs compared to 1D- and 2D-CNNs using a composition of X-band VV and HH SAR data for early-season classification, with 98.5% accuracy one month before harvest. However, they also noted a significant decrease in classification accuracy, by approximately 20%, when predictions were made three months before the harvest. A recent study by Weilandt et al. (2023) demonstrated that spatio-temporal transfer learning (Sec 1.4.2) using Transformers alongside a fusion of SAR and optical data can classify crops 1 to 3 months prior to harvest. Notably, their model outperformed Heupel et al. (2018) by identifying Barley and Rye at least two months earlier in an unseen year. Their results, consistent with Kondmann et al. (2022), showed that using a CNN-based classification method, Rapeseed and Sugar Beet could be identified at least one month earlier, and Maize three months before harvest, similar to the findings of Rußwurm et al. (2023). Moreover, their model detected Wheat two months before harvest, a slight deviation from Heupel et al. (2018), who detected it 3 months prior using optical data. In a recent study, Liu et al. (2023) managed to classify tobacco during the mid-growing period with over 85% accuracy using a combination of VH and VV features alongside an Attention LSTM FCN model. In conclusion, simpler DL methods such as 1D-CNNs, GRU, and LSTM have proven effective for early-season classification of major crops like corn and rice. For minor crops, however, more complex models like Transformers have shown their efficacy.

c. Crop Rotation Mapping

Crop rotation mapping is a complex task that involves predicting the sequence of crops planted in a field over multiple growing seasons. This practice is crucial for sustainable agriculture, as it helps maintain soil health, reduce pest and disease pressure, and optimize nutrient management.
Given the sequential nature of crop rotations, DL architectures that can capture temporal patterns and dependencies in time-series data are particularly well-suited for this task. Among the reviewed literature, Dupuis et al. (2023) employed a Sequence-to-Sequence LSTM (Seq2Seq-LSTM) model alongside SAR data for forecasting field-level crop rotation over multiple years. This Seq2Seq-LSTM model, distinct from traditional LSTM models by its encoder-decoder structure, is specifically designed to handle complex sequence-to-sequence transformations, offering enhanced capability in predicting the sequence of crops over successive periods. It forecasts the likely crops to be planted in future cycles, with predictions further refined through a Conditioned Probability model, showcasing the model's advanced ability to capture temporal patterns and transitions in crop cultivation practices. While the application of DL in crop rotation mapping using SAR data is still in its early stages, the potential for temporal and spatio-temporal DL architectures to advance this field is significant.

As discussed in section 1.3.2.1a, several temporal and spatio-temporal DL architectures, including LSTM, GRU, ConvLSTM, their combination with attention mechanisms, and Transformers, have shown promising results in crop classification. These architectures have the ability to process and learn from sequential data, making them potential candidates for future research in crop rotation mapping.

d. Mapping of Center Pivot Irrigation System (CPIS)

The study of the temporal dynamics of CPIS, which are influenced by factors such as cropping systems, irrigation practices, and tillage protocols, reveals significant challenges for their detection at a single point in time. However, leveraging multi-temporal SAR images effectively addresses these challenges by facilitating the tracking of changes in shape over time.
Three key characteristics of SAR imagery underscore its suitability for CPIS detection: its ability to penetrate cloud cover; the distinct backscatter signatures reflective of variations in SM, crop types, and growth stages within and outside of the designated areas; and its consistent and comprehensive temporal coverage. While previous research has investigated the utility of the U-Net architecture at the pixel level with optical data for CPIS identification (de Albuquerque et al., 2020; Saraiva et al., 2020), advanced object detection models such as Faster R-CNN and Mask R-CNN have been applied to an optimal number of SAR observations to detect CPIS (de Albuquerque et al., 2021). Faster R-CNN introduces a Region Proposal Network (RPN) that streamlines the object detection process by generating region proposals directly from image features. Building on this, Mask R-CNN adds a segmentation mask prediction branch for each Region of Interest, enabling detailed object localization through precise pixel-level segmentation in addition to bounding box identification and object classification. This makes it exceptionally suitable for tasks that require intricate object detailing and effective background differentiation. Future work in this domain should focus on expanding the methodologies and technologies applied to CPIS detection, leveraging the advancements in SAR imaging with DL models to enhance the accuracy of CPIS mapping.

e. Soil Salinity Mapping

Given the critical global issue of soil salinization, impacting an estimated 230 million hectares of irrigated land (Metternicht and Zinck, 2003), the need for efficient and accurate monitoring methods is crucial. This is particularly the case in dry seasons, when salinity intrusion tends to worsen as river systems undergo significant reductions in water discharge. Such conditions underscore the necessity of meticulous monitoring and management of soil salinity to mitigate its adverse environmental effects (Hoa et al., 2019).
RS technology, particularly SAR, offers a cost-effective and promising solution for the acquisition of soil salinity data (Huang et al., 2019). The effectiveness of SAR in soil salinity mapping is attributed to its sensitivity to the soil's dielectric constant, a measure significantly influenced by the soil's moisture content and salinity levels. The dielectric constant, represented as a complex number, comprises real and imaginary components. The imaginary component, which is associated with the soil's ability to absorb energy, is a pivotal factor in soil salinity detection (Chandrasekaran et al., 2012). This sensitivity has enabled the accurate mapping of soil salinity across varied landscapes, as demonstrated by the correlation of in-field salinity measurements with Radarsat-2 data in semi-arid regions by Barbouchi et al. (2014). While the intersection of SAR and DL in soil salinity detection is a growing field with only a few foundational studies, such as those by Nurmemet et al. (2018) and Zhang et al. (2020), the potential for future research is vast. These studies, employing quad-polarization and dual-polarization decomposition analyzed through 2D-CNNs and 1D-CNNs, respectively, underscore the promise of combining SAR and DL for soil salinity detection. Future research can build upon the existing groundwork by exploring different SAR data features, incorporating additional SAR frequencies, and examining spatio-temporal DL architectures for more precise detection of soil salinity at large or global scales.

1.3.2.2. Crop Monitoring: Phenology and Biophysical Parameters Estimation

Crop phenology, which tracks the growth stages of crops from planting to harvest, plays a pivotal role in dynamic crop monitoring (Richardson et al., 2013), precision agriculture (Gao et al., 2017; Jentsch et al., 2009), yield prediction (Yuan et al., 2016), and enhancing agricultural productivity (Jung et al., 2021; Weiss et al., 2020).
Moreover, as highlighted in section 1.2, the characteristics of SAR data, which include its sensitivity to vegetation biomass, make it a valuable tool for correlating with and indicating the phenological stages of crops (McNairn & Brisco, 2004). Despite the joint utilization of SAR and optical data to estimate BPs and monitor crop growth (Mercier et al., 2020; Veloso et al., 2017b), direct feature stacking has not been successful in exploring the nonlinear complementary relationship between the two data types. This failure is primarily due to the complex nonlinear response exhibited by SAR and optical data in the temporal domain, influenced by crop phenology. ML methods, despite their formidable power across various classification tasks, inherently struggle with temporal sequence data. This limitation can hinder their ability to precisely predict the timing of phenological stages (Lobert et al., 2023). Conversely, DL methods, specifically RNNs, can effectively capture temporal dependencies and dynamics in SAR and optical data. Utilizing LSTM layers within a 1D U-Net has proven effective in estimating different phenology stages of winter wheat by capturing the dynamics in the SAR gamma naught and optical time-series (Lobert et al., 2023). Recently, the multi-head attention mechanism, particularly as employed in Transformers, has exhibited superior capabilities in capturing long-range contextual features. The integration of CNNs for feature extraction and the fusion of SAR with optical data, along with the use of a Transformer model for temporal analysis, was explored by Zhao et al. (2022) to enhance the detection of the start and end of the growing season. Similarly, Thorp and Drajat (2021) demonstrated the superiority of a spatio-temporal DL model, ConvLSTM, over Conv2D, LSTM, and GRU in identifying the tillering, heading, and harvesting stages of paddy rice using SAR and optical fusion.
While models integrating spatial and temporal analyses have demonstrated superior accuracy, other studies such as Han et al. (2022) and Hosseini et al. (2019) have applied 2D-CNNs and 1D-CNNs, respectively, for estimating LAI and Canopy Chlorophyll Content (CCC) in winter wheat. These approaches use a combination of SAR backscatter data, polarimetric features, and radar vegetation indices.

1.3.2.3. Yield Prediction

Crop yield prediction is of great importance in ensuring food security and meeting the growing demand for crop production (Battude et al., 2016). This is a complex task because many factors affect crop yield, such as soil type, weather conditions, cultivation practices (e.g., date of sowing, amount of irrigation and fertilizer), and biotic stress (Dadhwal, 2003). DL-based models are a powerful tool for extracting useful information from raw satellite imagery, enabling accurate crop growth monitoring and yield prediction. These models bypass the need to directly measure hard-to-obtain parameters such as planting schedules, irrigation, fertilizer supply, and soil characteristics, which are traditionally crucial to crop models. Using DL with SAR to predict crop yields is an emerging area of research, currently evidenced by three published studies. Simple DL methods such as MLP (Tripathi et al., 2022), LSTM (Yu et al., 2023), and DNN with 3-6 hidden layers (Tesfaye et al., 2022) have been used with SAR imagery to predict rice and wheat yield. Tesfaye et al. (2022) advocated a fusion of SAR, optical, and meteorological data as the optimal approach for rice yield prediction, whereas Tripathi et al. (2022) emphasized the importance of soil health parameters, namely SM, soil salinity, and Soil Organic Carbon (SOC), for enhancing wheat yield estimation.
While other research indicated LSTM's superiority over ML methods in county-level yield prediction using RS and meteorological data (Barriguinha et al., 2022; Cao et al., 2021), Yu et al. (2023) found that Meta-Learning Ensemble Regression (MLER), an ensemble learning algorithm that integrates predictions from various ML models (Vanschoren, 2018), outperformed LSTM for small datasets and equaled its accuracy for larger ones. Complementing this, Tripathi et al. (2022) underscored the significance of dataset size for MLP's success in yield estimation. They observed that simple regression techniques outperformed MLP for smaller datasets, but an enhanced MLP with additional hidden layers surpassed other ML methods, including OLS, KNN, RF, DT, Ridge regression, and SVR, for larger datasets. Likewise, Tesfaye et al. (2022) showed that increasing the number of hidden layers in the DNN improved wheat yield prediction accuracy. This indicates that carefully adding complexity through more hidden layers can uncover more detailed data patterns crucial for yield prediction, thereby enhancing the model's performance. A recent systematic literature review by Muruganantham et al. (2022), focused on the use of DL and RS for crop yield prediction, reported CNNs, LSTM, and ConvLSTM as the DL models most commonly used with optical data. A 3D-CNN model was reported as optimal for predicting soybean yield using optical imagery from sources like MODIS (Abbaszadeh et al., 2022; Fernandez-Beltran et al., 2021; Gavahi et al., 2021; Qiao et al., 2021; Russello, 2018; Terliksiz and Altılar, 2019), and ConvLSTM was superior to 2D-CNNs and LSTM in predicting soybean yield using MODIS data, weather information, land surface temperature (LST), and surface reflectance data (J. Sun et al., 2019). In a recent study, Hashemi et al.
(under review) demonstrated that, with a small dataset, 3D-CNNs and XGBoost (a traditional ML method) performed comparably in maize, soybean, and winter wheat yield estimation. For future research, leveraging advanced DL models such as 3D-CNNs, ConvLSTM, and attention mechanisms using Transformers, alone or in combination with other DL models (as discussed in the Mapping/Classification section, 1.3.2.1), holds promise for enhancing yield predictions across different crop varieties. However, given the complexity of these models, it is crucial to collect substantial reference data to serve as training datasets.

1.3.3. Implementation Consideration

1.3.3.1. Data Collection and Augmentation Techniques

Methods for field data collection, such as point observations or plot-based data, frequently encounter difficulties in forming a direct spatial and temporal association with SAR data. This presents a significant challenge when producing adequate reference data for agricultural applications. Most DL studies in classification/mapping applications resort to visual interpretation of primary or secondary RS data for reference, or for delineating target classes in a GIS environment. Such interpretations may include identifying individual targets for agricultural object detection (Freudenberg et al., 2019) or demarcating vegetation elements as polygons for semantic or instance segmentation (Kattenborn et al., 2021). According to our review, several studies relied on visual interpretation of various resources. These included Landsat-8 (de Albuquerque et al., 2021; Wei et al., 2019), fine-resolution RGB images and Pléiades satellite imagery (M. Wang et al., 2022), high-resolution Google Earth images (Crisóstomo de Castro Filho et al., 2020), Korea Multi-Purpose Satellite-2 (KOMPSAT-2), IKONOS images, and orthorectified aerial photos (Jo et al., 2020), and optical data and physical characteristics (Ge et al., 2023).
Data augmentation, used in 12% of the studies, increases the size and diversity of training datasets. By introducing minor alterations or generating synthetic data, this technique improves the network's robustness in classifying unseen data. The literature employed various data augmentation strategies. Rotation and flipping, including horizontal and vertical flips and 90-degree rotations, were used in several studies (Chamorro Martinez et al., 2021; K. Li et al., 2022; M Rustowicz et al., 2019; Rogozinski et al., 2022; Teimouri et al., 2019; M. Wang et al., 2022). Gaussian noise was added by K. Li et al. (2022). Other strategies included solarization, as used by K. Li et al. (2022), and scaling or zooming, as employed by Teimouri et al. (2019).

1.3.3.2. Training and Validation

To ensure robustness and transferability and to prevent overfitting, it is crucial to validate DL models independently before deployment. This ensures that they can generalize beyond specific instances. To this end, supervised DL models require three distinct datasets: training, validation, and testing. The validation dataset is used during model training to optimize hyperparameters and to trigger early stopping, which mitigates the risk of overfitting, while the test dataset is used after training to assess the final model's performance. It is essential that this assessment does not depend solely on iterative shuffling of training and validation data but is instead based on entirely independent data unseen by the model. Usually, 20 to 30% of the reference data is set aside for independent validation and testing.

1.4. Challenges

1.4.1.
Challenges in the Use of SAR in Agriculture

Several challenges must be addressed before SAR observations can be effectively used for feature extraction in DL models:

Dynamic Range Management: SAR observations have a large dynamic range, which can be as high as 90 dB depending on the spatial resolution (Steele-Dunne et al., 2017). Managing this range is crucial for the stability of DL models such as CNNs. These models generally expect inputs within a bounded range (e.g., 0 to 1 or -1 to 1), and large dynamic ranges can lead to numerical instability, causing issues like exploding or vanishing gradients. This can slow convergence during training or even prevent the model from learning from the data. To mitigate this, dynamic range compression techniques such as normalization and amplitude value thresholding are employed. Normalization scales data to a standard range, ensuring balanced input features, while amplitude value thresholding clips extreme values, effectively reducing the dynamic range (Metzler et al., 2020; Shi et al., 2022).

Speckle Filtering: Speckle presents a unique challenge in the analysis of SAR images. Unlike additive noise, speckle is a form of multiplicative interference, which can significantly complicate the extraction of meaningful features from SAR images. Traditional edge and low-level feature detectors, which are typically designed to handle additive noise, may not be optimal for dealing with speckle. As such, specific techniques and adaptations are often required to process SAR images effectively. While enhanced speckle filtering techniques can mitigate some of the effects of speckle, the presence of noise in SAR data remains a significant issue. This can lead to poor model performance, particularly at finer resolutions (e.g., 10 meters) (J. Li et al., 2022).
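As a minimal numerical sketch of the multiplicative character of speckle, the following Python snippet simulates single-look intensity over a homogeneous field and then averages neighbouring pixels. It assumes fully developed speckle modelled as unit-mean exponential noise with independent pixels, which is a textbook simplification rather than anything taken from the studies cited here; the function names, the true intensity of 0.05, and the block size of 9 are illustrative assumptions.

```python
import random
import statistics


def simulate_speckled_intensity(true_intensity, n_pixels, rng):
    """Single-look intensity: true value times unit-mean exponential speckle (assumed model)."""
    return [true_intensity * rng.expovariate(1.0) for _ in range(n_pixels)]


def block_average(values, block):
    """Average consecutive groups of `block` pixels (a 1-D stand-in for spatial aggregation)."""
    usable = len(values) - len(values) % block
    return [sum(values[i:i + block]) / block for i in range(0, usable, block)]


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; independent of the true intensity for multiplicative noise."""
    return statistics.pstdev(values) / statistics.fmean(values)


rng = random.Random(0)
single_look = simulate_speckled_intensity(true_intensity=0.05, n_pixels=90_000, rng=rng)
aggregated = block_average(single_look, block=9)  # e.g. 3 x 3 pixels of 10 m -> 30 m

cv_single = coefficient_of_variation(single_look)
cv_aggregated = coefficient_of_variation(aggregated)
print(f"CV single-look: {cv_single:.2f}, CV after averaging 9 pixels: {cv_aggregated:.2f}")
```

Under this model the noise standard deviation scales with the signal, so the coefficient of variation is close to 1 for single-look data and falls to roughly one third after averaging nine independent pixels. Real SAR pixels are spatially correlated, so the reduction achieved in practice is smaller.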
One common approach to mitigate the impact of speckle is to aggregate SAR images to a coarser resolution, such as 30-50 meters (typical agricultural plot size), which can improve the performance of subsequent analysis. Incorporating plot-scale data can also help reduce noise interference and further improve prediction results (Garioud et al., 2021). In addition to these preprocessing techniques, recent research has explored robust DL methods specifically designed to handle noise and other imperfections in data. One notable example is the combination of CNNs and LSTMs, which has been shown to effectively mitigate speckle in SAR images (Mohan et al., 2021). CNN layers spatially filter out speckle while identifying and preserving essential structural details like edges, and LSTM layers enhance this process by ensuring temporal consistency and coherence across image sequences. This dual approach improves SAR image quality by removing noise while safeguarding critical image features. Dalsasso et al. (2020) initially explored transfer learning from pre-trained denoising models, as well as end-to-end training strategies specifically tailored for SAR despeckling. Building on this foundation, they introduced the SAR2SAR algorithm in 2021, which employs a semi-supervised strategy: training first on simulated speckle and then fine-tuning on real SAR image pairs with a change compensation mechanism. Further innovations include the self-supervised learning approach developed by Dalsasso et al. (2022), which uses a convolutional U-Net architecture to process single-look complex SAR data by exploiting the statistical independence between real and imaginary components. Meraoumia et al.
(2023) extended this concept to leverage multiple SAR acquisitions, learning an effective despeckling model without requiring clean ground truth images.

Imaging Geometry: The range and azimuth coordinates inherent to the SAR image generation process pose challenges for processing and data augmentation. Using rotation as a data augmentation technique could produce distorted imagery because of these coordinates. Therefore, careful consideration is required when applying such techniques to SAR data.

Phase Component Analysis: The phase component contains valuable information for training DL models for crop classification and monitoring applications, and careful consideration is necessary when selecting nonlinear activation functions and loss functions. Activation functions introduce non-linearity into the model, allowing it to learn complex patterns. For processing phase information, the suitability of certain activation functions, such as ReLU, which sets negative inputs to zero, is limited: they risk discarding critical phase information that falls below zero. However, applying normalization maps all phase values into a positive range, making them compatible with activation functions like ReLU and safeguarding against the loss of vital data. Loss functions, on the other hand, measure the discrepancy between the model's predictions and the actual values. When dealing with phase information, it is crucial to choose a loss function that can handle the cyclical nature of phase data. Mean Squared Error, for instance, might not be the best choice because it does not account for this cyclical nature, which can lead to inaccuracies. Therefore, the choice of activation and loss functions should be made carefully, considering the nature of phase information in SAR data.
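To illustrate why a loss function must respect the cyclical nature of phase, the sketch below compares a plain squared error with one possible wrap-aware alternative, one minus the cosine of the phase difference. This particular loss is an illustrative choice and is not taken from the studies cited here; the function names `squared_error` and `circular_loss` are hypothetical.

```python
import math


def squared_error(pred, target):
    """Plain squared error, which treats phase as an ordinary real number."""
    return (pred - target) ** 2


def circular_loss(pred, target):
    """Wrap-aware loss: 0 when the angles coincide, 2 when they are opposite."""
    return 1.0 - math.cos(pred - target)


# Two phases that are only 0.02 rad apart on the circle, but that sit on
# opposite sides of the +/- pi wrap-around point.
pred = math.pi - 0.01
target = -math.pi + 0.01

print(f"squared error: {squared_error(pred, target):.3f}")
print(f"circular loss: {circular_loss(pred, target):.6f}")
```

For this pair the squared error is large (about 39) even though the two angles nearly coincide, whereas the circular loss is close to zero. A model trained with the squared error would be penalized heavily for a prediction that is, in phase terms, almost correct.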
Orbit Variation: The use of SAR images from both ascending and descending orbits, or from a combination of different sensors, can introduce challenges due to varying incidence angles and azimuths between orbits. These differences can cause a periodic “orbit-bias”, requiring extra processing, e.g., an incidence angle correction algorithm, to correct such orbit effects (Navacchi et al., 2022; Quast et al., 2023). It is worth noting that adding the incidence angle as a feature to the DL algorithm can help reduce orbit effects. For instance, Han et al. (2022) used the local incidence angle as an input feature to a 2D-CNN to reduce the orbit-bias effect when combining Sentinel-1 and Sentinel-3 to estimate BPs. Additionally, the unique imaging principles of SAR introduce complexities in capturing the dynamic scattering characteristics of crops, which are significantly influenced by factors like irrigation schedules and planting times. These variables can disrupt the temporal consistency of SAR data, posing further challenges to the spatio-temporal generalization needed for accurate estimation of BPs, yield, and agricultural management practices. A promising mitigation strategy is the fusion of SAR with optical data, which leverages the complementary strengths of both data types to enhance model robustness against these variations.

1.4.2. Challenges in the Use of Deep Learning in Agriculture

DL practitioners in agriculture face a multitude of challenges that can significantly impact the effectiveness of their models. These challenges can be broadly categorized into two main areas: data quality and availability, and model design and implementation.

a) Data quality and availability challenges

The success of DL models in agriculture depends on the availability of high-quality, well-curated datasets (Zhu et al., 2021).
While data augmentation can expand the volume of datasets, publicly available agricultural datasets are frequently limited, so extensive, labor-intensive ground-based data collection is required. In response to the challenge of scarce labeled data, research has explored a wide range of strategies. Within this context, numerous studies have investigated how the limited size of reference datasets influences the accuracy of DL models. Advancements in DL methodologies, such as weakly supervised LSTM networks (Wang et al., 2020), Self-Attention Mechanisms (Transformers) (J. Li et al., 2022), and Stacked Auto-Encoders (SAE) (Zhang et al., 2023), have demonstrated significant resilience in maintaining model accuracy with substantially reduced dataset sizes. For instance, reducing labeled data by up to 90% may result in only minimal decreases in accuracy metrics (Wang et al., 2020). Similarly, Transformers maintained good performance with only 30% of the training dataset for NDVI construction from SAR data, which is relevant to crop classification improvements (J. Li et al., 2022). Remarkably, SAE-based methods achieved a classification accuracy of 98.6% with only 2% of the training dataset, and 94% with just 0.5%, illustrating a critical advancement in the efficient training of DL models (Zhang et al., 2023). Other studies have assessed how well their DL methods generalize across datasets from various regions and times, spanning both small and large sizes. The Geodesic Distance Spectral Similarity Model (GDSSM) was used alongside a 1D-CNN to efficiently extract and use training samples from a limited dataset (H. Li et al., 2022). GDSSM identifies pixels with high similarity to labeled samples, effectively augmenting the amount of training data available.
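The general idea of expanding a small labeled set with highly similar unlabeled pixels can be sketched as follows. This is a simplified stand-in and not the GDSSM of H. Li et al. (2022): plain cosine similarity is swapped in for the geodesic-distance-based spectral similarity measure, and the threshold, the function names, the crop labels, and the toy backscatter time series are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long time series."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def expand_labels(labeled, unlabeled, threshold=0.995):
    """Give each unlabeled series the label of its most similar labeled series,
    but only when that similarity reaches `threshold`; otherwise leave it out."""
    pseudo_labeled = []
    for series in unlabeled:
        best_label, best_sim = None, -1.0
        for ref_series, label in labeled:
            sim = cosine_similarity(series, ref_series)
            if sim > best_sim:
                best_label, best_sim = label, sim
        if best_sim >= threshold:
            pseudo_labeled.append((series, best_label))
    return pseudo_labeled


# Toy four-date backscatter series (linear power units, hypothetical values).
labeled = [
    ([0.02, 0.05, 0.09, 0.06], "corn"),
    ([0.03, 0.03, 0.04, 0.08], "wheat"),
]
unlabeled = [
    [0.021, 0.049, 0.092, 0.058],  # closely follows the corn profile
    [0.05, 0.05, 0.05, 0.05],      # flat profile, unlike either crop
    [0.029, 0.031, 0.041, 0.079],  # closely follows the wheat profile
]

for series, label in expand_labels(labeled, unlabeled):
    print(label, series)
```

The strict threshold is a deliberate design choice in this sketch: ambiguous pixels, such as the flat profile, are left unlabeled rather than risk adding wrong labels to the training set.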
Furthermore, a Spatial Feature-based Convolutional Neural Network (SF-CNN) with a dual-branch CNN structure processes groups of samples rather than individual samples, which expands the training set by combining different samples (Shang et al., 2022). Z. Han et al. (2023) demonstrated the generalization capability of integrating CNNs and Transformers to handle multi-scale spatio-temporal features, maintaining high accuracy even in regions where data might be sparse or highly variable.

Another popular technique to mitigate the limitation of scarce labeled data is transfer learning (TL), which enhances the adaptability of DL models to new domains or tasks by using pre-trained models, often referred to as 'pre-trained backbones'. This method starts with training a model on a vast and varied dataset, known as the source, followed by fine-tuning it for a specific, different domain, termed the target. Studies dealing with the transferability of crop mapping models using SAR imagery can be divided roughly into three categories: those dealing with transferability in the temporal domain (Hu et al., 2022; Pandžić et al., 2024), those dealing with transferability in the spatial domain (Jo et al., 2022), and those dealing with transitioning to a different task (Jo et al., 2022). Combining these categories, i.e., simultaneous spatiotemporal transferability, is a particularly complex task and thus rarely seen in the literature (Hao et al., 2020; Weilandt et al., 2023). TL conserves resources by minimizing the need for extensive training datasets and computational power for the target task. Moreover, it boosts model performance through the strategic use of pre-acquired knowledge. The literature reviewed identified two primary strategies for TL.
The 'shallow strategy' uses pre-trained low-level image features and fine-tunes only the final layers of the deep neural network for task-specific features using relevant imagery. The 'deep strategy', on the other hand, fine-tunes the entire network by back-propagating through all layers of the pre-trained network (Pires de Lima and Marfurt, 2019). The choice of layers to fine-tune depends on various factors, including the similarity between the source and target tasks, the complexity of the new task, the amount of available data for the new task, and computational resources. CNNs, known for their ability to identify and use hierarchical visual features, are especially adept at this form of learning, making TL a powerful tool for adapting models to new tasks efficiently (Kattenborn et al., 2021). Numerous pre-trained backbones are available for popular CNN architectures (Tuia et al., 2016) such as Visual Geometry Group (VGG) (Simonyan and Zisserman, 2014), ResNet (Szegedy et al., 2017), AlexNet (Krizhevsky et al., 2012), Densely Connected Convolutional Networks (DenseNet) (Huang et al., 2017), Inception (Szegedy et al., 2015), and Extreme Inception (Xception) (Chollet, 2017). However, a significant challenge arises when applying these backbones to SAR data. Unlike the 3-channel (RGB) images typically used for training these architectures, SAR datasets contain a larger number of features, including backscatter intensity, polarimetric decomposition parameters, coherence measures, radar indices, and observations across different bands. Potential solutions include band selection or feature reduction algorithms (Rezaee et al., 2018). However, these approaches could lead to the loss of potentially valuable information, which may affect the model's performance. Addressing the specific challenges posed by limited labeled data, 3D U-Net was evaluated by Jo et al.
(2022) by fine-tuning the encoder, the decoder, and the full model for paddy rice identification across different geographical regions. Among these, fine-tuning the encoder surpassed the other methods in both spatial and task-related TL. Further, the capability of Transformers for spatio-temporal TL in early-season crop classification was explored by Weilandt et al. (2023) using a Pixel-Set Encoder–Temporal Attention Encoder (PSE-TAE) DL model (Garnot et al., 2020). Their conclusion suggests that the model's adaptability to diverse weather conditions may be improved by including temporal TL and extending the training duration, rather than by integrating weather data. However, Pandžić et al. (2024) showed that CNNs outperformed Transformers and RF models in temporal TL for crop classification using Sentinel-1 imagery. Applying TL to all three models significantly enhanced classification accuracy within a new domain, with CNNs combined with TL exhibiting the most notable improvement. This outcome highlights the distinct advantage of CNNs in leveraging TL to optimize crop classification results. Consequently, the successful application of temporal TL suggests that it may not be necessary to collect ground truth data annually. The interannual applicability of these trained models holds promise for both predicting future crop type distributions and reconstructing historical ones, as affirmed by Hu et al. (2022).

Class imbalance in crop classification is another limitation that needs to be addressed before DL modeling. Yuan et al. (2023) introduced k-positive contrastive loss (KCL) (Kang et al., 2020) to handle imbalanced datasets in crop classification tasks.
Specifically, KCL randomly selects K instances of the same crop within a batch of data to form a set of positive samples. If there are fewer than K instances of the same crop in the batch, all instances of that crop are used instead. This helps ensure that the model receives enough examples of each class to learn effectively, even when some classes are underrepresented in the dataset. Further, Cué La Rosa et al. (2023) addressed the class imbalance issue with an online deep clustering technique called Learning from Label Proportions with Prototypical Contrastive Clustering (LLP-Co). This approach uses government-provided crop proportion data as priors and integrates them into a contrastive learning framework. Generative Adversarial Networks (GANs) are a DL method recently applied to generate data for minority classes in crop classification (Mirzaei et al., 2023). A GAN consists of two components: a generator that produces synthetic data and a discriminator that differentiates between synthetic and real data. The generator aims to create data that the discriminator cannot distinguish from real data, improving through adversarial training. However, GANs may struggle with non-Gaussian distributions in tabular data. To address this, Mirzaei et al. (2023) applied the Conditional Tabular GAN (CTGAN), specifically designed for tabular data. CTGAN supports conditional generation and employs categorical embeddings, making it effective for both continuous and categorical variables.
Although CTGAN needs substantial training data and is more time-consuming than traditional methods, its ability to accurately mirror complex data distributions marks a significant advancement over conventional data generation techniques like Random Under-Sampling (RUS), Random Over-Sampling (ROS), and Synthetic Minority Over-sampling Technique (SMOTE). It improved classification performance by 5% and offers tailored synthetic data creation to better represent minority classes. Resampling and cost-sensitive learning are other techniques that have been used to overcome the issue of imbalanced labeled datasets (Johnson and Khoshgoftaar, 2019; Khan et al., 2017).

b) Model Design and Implementation Challenges

Beyond dataset constraints, selecting the optimal network architecture is a pivotal challenge. The decision influences not only a model's ability to discern complex patterns but also its generalization capabilities. This is a delicate balancing act: too complex an architecture risks overfitting with smaller datasets, while too simple a model may underperform with larger or high-dimensional datasets. For example, crop phenology detection, which involves measuring BPs such as LAI and VWC, generally suffers from limited reference data due to the requirement for destructive in-situ measurements. Consequently, the DL architectures that are appropriate for crop classification may not be optimal for crop phenology detection. Additionally, hyperparameters strongly affect model performance, yet practice often defaults to standard settings, potentially overlooking opportunities to fine-tune models for optimal results. Training DL models effectively involves navigating challenges such as overfitting and the vanishing or exploding gradient problem, each presenting unique hurdles to model accuracy and generalizability.
Overfitting, a common issue with deep architectures, arises from a model's capacity to learn not just the underlying patterns but also the noise within the training data, thereby diminishing its performance on unseen data. This challenge is intricately linked to the structural complexity of DL models and the dimensionality of the input data (Carranza-García et al., 2019). Parallel to the issue of overfitting are the vanishing and exploding gradient problems, which directly impact the learning process. The vanishing gradient problem slows or halts learning as gradients diminish through layers, while the exploding gradient problem destabilizes learning with excessively large gradients. These issues highlight the delicate balance required in designing and training DL models to ensure stable and effective learning. To address these challenges, a suite of techniques such as regularization, dropout, early stopping, and batch normalization (BN) has been developed to enhance model robustness and prevent overfitting. Regularization adds a penalty to the loss function based on the complexity of the model, encouraging simpler solutions. Dropout randomly "drops out" a number of output features of a layer during training, making the network less reliant on any single feature and more robust to noise in the input data. Early stopping is a form of regularization that halts training when performance on a validation set stops improving, preventing the model from learning noise in the training data. BN normalizes the inputs of each layer within a mini-batch, reducing internal covariate shift and helping the model generalize better (Mikołajczyk and Grochowski, 2019). Similarly, several techniques have been employed to combat the vanishing and exploding gradient problems. Activation functions such as Rectified Linear Unit (ReLU), which outputs the input directly if it is positive and zero otherwise, help mitigate the vanishing gradient problem (Hu et al., 2021).
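The effect of the activation function on gradient magnitude can be illustrated with a small calculation. During backpropagation, the gradient reaching an early layer contains a product of activation derivatives, one per layer. The sketch below ignores the weight matrices and assumes every unit sees the same pre-activation value, which is a deliberate simplification; the function names and the depth of 20 layers are illustrative assumptions.

```python
import math


def sigmoid_derivative(z):
    """Derivative of the logistic sigmoid; it never exceeds 0.25."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def relu_derivative(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def derivative_product(derivative, z, depth):
    """Product of `depth` identical activation derivatives (weights ignored)."""
    return derivative(z) ** depth


depth = 20
print(f"sigmoid, z=0: {derivative_product(sigmoid_derivative, 0.0, depth):.3e}")
print(f"ReLU,    z=1: {derivative_product(relu_derivative, 1.0, depth):.1f}")
```

Even at its most favourable input, the sigmoid factor shrinks to roughly 1e-12 over 20 layers, while the ReLU factor stays at 1 for active units. Because the weights are ignored, this calculation says nothing about exploding gradients, which depend on the weight magnitudes.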
Additionally, gradient clipping is used to prevent gradients from becoming excessively large. Furthermore, residual connections, or skip connections, allow gradients to bypass certain layers directly, thus mitigating the vanishing gradient problem (Yahya et al., 2023). However, integrating these techniques demands careful consideration to avoid adverse interactions, which exemplifies the complexity of optimizing DL models. This optimization extends beyond technique selection to encompass implementation costs and the practicalities of model training, which may be hindered by computational or hardware limitations (Chen et al., 2014; Christiansen et al., 2016). Despite these challenges, DL has gained significant popularity owing to various technological advancements. These include efficient data processing techniques, high-performance graphics cards, cloud-computing capabilities, and open data initiatives that offer annotated data. Such developments enable the efficient computation of numerous non-linear transformations of input data, thereby establishing the fundamental strength of DL: its capacity to learn end-to-end (Kattenborn et al., 2021).

1.5. Opportunities

Despite the advancements in SAR and DL methods for agricultural applications, several gaps remain that require further exploration. The following subsections outline these gaps and potential directions for future work.

1.5.1. SAR Data Preprocessing

The Alaska Satellite Facility (ASF) offers Sentinel-1A and Sentinel-1B Radiometrically Terrain Corrected (RTC) products at no cost, developed using the GAMMA software. These products provide a 10-to-30-meter spatial resolution in different scales (decibel, power, and amplitude) and radiometric units (gamma naught and sigma naught) (ASF, 2023). Additionally, ASF incorporates speckle filtering for these products.
Intriguingly, despite the availability of preprocessed Sentinel-1 RTC images, none of the reviewed papers have integrated them with DL techniques in agricultural applications.

1.5.2. Availability of Multi-Frequency Data from Future Missions

Notably, L-band SAR data availability has been constrained, with data predominantly sourced from airborne platforms and the Advanced Land Observing Satellite (ALOS) PALSAR. These platforms provide data with coarse temporal resolutions, and unlike open-access datasets, they often require specific task submissions for access (Table 1). Our review indicates that, to date, no study has utilized ALOS PALSAR data in conjunction with DL for agricultural applications. Another source, the SMAP mission L-band SAR data, was only accessible for approximately 2.5 months during the summer of 2015. However, the planned launches of the NASA-ISRO Synthetic Aperture Radar (NISAR) satellite in 2024 and the Radar Observing System for Europe - L-band (ROSE-L) in 2028 bring promising opportunities for utilizing SAR L-band observations (Table 1). With a temporal resolution of 12 days (exact repeat) and a spatial resolution of 10 meters, Sentinel-1, NISAR, and ROSE-L enable the use of both C- and L-band data at similar spatial and temporal resolutions. The combination of C- and L-band SAR observations may offer significant advantages in crop analysis. Firstly, it can enhance the discrimination and classification of different crop types, particularly during early-season classification, when image availability is limited and the crops' structures are similar to each other. Integrating C- and L-band SAR data can also improve crop residue and tillage detection. C-band SAR data is sensitive to structural characteristics such as canopy height and biomass, while L-band SAR data responds to moisture content and vegetation water content. Thus, combining these two bands provides a comprehensive understanding of these agricultural management practices.
Secondly, the fusion of C- and L-band SAR data can be highly beneficial for phenology and BPs estimation. Each phenological stage exhibits a unique radar signature due to changes in vegetation structure, biomass, and moisture content. By incorporating both C- and L-band SAR data, researchers can capture these phenological changes more precisely, leading to a more accurate representation of crop phenology and yield estimation.

1.5.3. Emerging Applications

1.5.3.1. Phenology, Biophysical Parameters, and Yield Estimation

While SAR has shown promise for crop growth monitoring, BPs estimation, and yield prediction, only a limited number of studies have explored the integration of SAR and DL in these areas. The challenge often lies in the DL models' requirement for extensive training datasets, which are particularly difficult to compile for such specialized applications. To address this challenge, future research could explore strategies that make DL models effective for these applications even with limited datasets. Among these strategies are data augmentation (discussed in Section 1.3.3.1), transfer learning (discussed in Section 1.4.2), and foundation models. Foundation models, which leverage self-supervised learning (SSL) techniques, can be particularly valuable because their pretraining does not rely on labeled datasets. Instead, they are pretrained using SSL methods and subsequently fine-tuned for specific tasks with smaller, labeled datasets. Recent advancements have seen the application of foundation models across a range of tasks, using SSL methods such as contrastive learning, Masked Autoencoders (MAE), Masked Image Modeling (MIM), DINO, Bootstrap Your Own Latent (BYOL), Momentum Contrast (MoCo), CACo loss, and Seasonal Contrast (SeCo), combined with Transformers or ViT (Wang et al., 2022) and various RS data types, including SAR, optical, and LiDAR.
These applications encompass a wide range of domains, including forest monitoring (Bountos et al., 2023), image segmentation (Fuller et al., 2023), crop mapping (Xu et al., 2024), and land cover classification (Prexl and Schmitt, 2023). Employing these techniques can significantly boost the accuracy and generalizability of models dedicated to yield prediction and BPs estimation, promising substantial progress in the integration of SAR and DL in agricultural monitoring and assessment.

Figure 1.5 illustrates a comprehensive workflow that integrates SAR data and DL techniques for various agricultural applications, including crop classification, phenology, BPs retrieval, and yield estimation. Because the availability of reference data varies among these applications, with crop type data being more abundant than BPs, growth stage, and yield data, the workflow incorporates a fine-tuning step in which the parameters of a pre-trained crop mapping model are adapted to the BPs and yield estimation applications. This TL approach leverages the knowledge gained from the crop classification task to improve the performance of DL models in applications with limited reference data, thereby enhancing the overall effectiveness of the SAR-based DL framework in agricultural management.

Figure 1.5: A flowchart illustrating the integration of SAR and DL for agricultural applications. The flowchart highlights three major components: (1) pixel-wise crop classification from SAR imagery, (2) feature selection, multi-modal fusion, and selection and implementation of appropriate DL architectures, and (3) fine-tuning the crop classification model parameters for robust crop phenology estimation, BPs retrieval, and yield prediction.

1.5.3.2. Agricultural Management Practices Detection

a. Planting and Harvest Dates Estimation

Accurate estimation of planting and harvest dates is crucial for optimizing crop yields, resource allocation, and adaptation to climate variations.
However, traditional crop calendars may not account for dynamic field conditions (Hashemi et al., 2022), highlighting the need for robust, real-time estimation methods using RS data to improve agricultural planning and productivity. Several studies have demonstrated the potential of SAR for planting date estimation (Phan et al., 2018b; Shang et al., 2020) and examined the impact of SAR-based planting dates on crop yield estimated with crop growth models (Hashemi et al., 2022). Integrating DL with SAR data can make the estimation of planting and harvest dates more precise and reliable. DL techniques can reduce noise and mitigate the effects of varying orbit combinations, enabling the use of high temporal resolution SAR data with revisit times of 6 days or less. This enhanced temporal resolution allows subtle changes in crop growth and development to be captured, leading to more accurate detection of key events such as planting and harvesting.

b. Crop Residue and Tillage Mapping

Crop residue management plays a vital role in maintaining soil health, reducing erosion, increasing fertility, and ensuring agricultural sustainability (Zheng et al., 2014). Determining the timing and variability of tillage across landscapes relies on multi-temporal imagery to provide a full picture of tillage patterns for a region (Zhang et al., 2012). The cloud-penetrating capabilities of SAR significantly enhance the potential for acquiring sequential imagery, and its sensitivity to surface roughness and moisture makes SAR well suited to mapping crop residue and tillage. Figure 1.6 illustrates the impact of corn residues on Sentinel-1 VH backscatter from a plowed and a non-plowed soybean farm. The green line represents the VWC measured every 12 days, based on the Sentinel-1 overpass, in the summer of 2022 in two different soybean farms in Michigan.
In the well-plowed field, the backscatter is consistent with the VWC pattern, indicating the soybean growth cycle. However, the presence of crop residue in the other field has caused abnormal behavior in the VH backscatter. Despite the impact of crop residue on SAR backscatter, there has been a lack of research investigating its effect on the performance of crop classification/mapping, monitoring, and BPs estimation.

Since 2010, only TerraSAR-X (X-band) and Sentinel-1 (C-band) data have been used for conservation tillage monitoring (Zhang et al., 2024). Although the dual-polarization (HH and HV) backscattering coefficients of TerraSAR-X images have been employed to distinguish soil roughness differences due to tillage methods, these SAR observables have shown weak correlations with surface roughness (Pacheco et al., 2010). In a study by Cai et al. (2019), the indices $\sigma^0_{VV}/\sigma^0_{VH}$ and $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$ were found to be effective in winter wheat crop residue detection using regression methods. However, with the maximum R² value reaching only 0.4, there is a clear need for the application of DL and ML methods to improve accuracy.

To address these limitations and effectively detect and analyze crop residue and tillage practices, future research should focus on developing robust DL algorithms that leverage the potential of SAR data. These algorithms should incorporate ancillary data, such as surface roughness and SM, to enhance the accuracy and reliability of the results. Furthermore, combining different SAR frequencies, such as C- and L-band, can provide complementary information and improve the overall performance of crop residue and tillage mapping. By integrating advanced DL techniques, multi-source data fusion, and multi-frequency SAR data, researchers can develop more comprehensive and accurate methods for monitoring and understanding the complex dynamics of crop residue management and tillage practices in agricultural systems.
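As an illustration of the two Sentinel-1 indices attributed to Cai et al. (2019) above, the following minimal Python sketch computes the VV/VH ratio and the normalized difference for a single pair of backscatter values. It assumes that backscatter is supplied in dB and converted to linear power before the indices are formed; the unit convention of the original study is not stated here, and the function names and the numeric values are hypothetical, chosen only for demonstration.

```python
from typing import Tuple


def db_to_linear(sigma0_db: float) -> float:
    """Convert a backscatter coefficient from decibels to linear power units."""
    return 10.0 ** (sigma0_db / 10.0)


def polarization_indices(vv_db: float, vh_db: float) -> Tuple[float, float]:
    """Return (VV/VH ratio, normalized difference), both formed in linear units.

    Assumption: inputs are sigma-nought values in dB for the same pixel or
    field and the same acquisition date.
    """
    vv = db_to_linear(vv_db)
    vh = db_to_linear(vh_db)
    ratio = vv / vh
    norm_diff = (vv - vh) / (vv + vh)
    return ratio, norm_diff


# Hypothetical values for one field on one date (not measured data).
ratio, norm_diff = polarization_indices(vv_db=-10.0, vh_db=-17.0)
print(f"VV/VH ratio = {ratio:.3f}, normalized difference = {norm_diff:.3f}")
```

Because the normalized difference equals (ratio - 1)/(ratio + 1), the two indices carry the same information for a single pixel; the normalized form is simply bounded between -1 and 1, which can be convenient as a regression or DL input feature.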
Figure 1.6: Impact of crop residues on cross-polarized Sentinel-1 backscatter from a soybean farm and its influence on crop growth cycle monitoring using SAR backscatter time-series. (a) Soybean farm with corn residue. (b) Soybean farm without corn residue (well plowed).

c. Cover Crop Mapping

Cover cropping is an essential agricultural practice that plays a vital role in promoting soil health and fertility (Reicosky and Forcella, 1998). By reducing nutrient leaching, cover crops contribute to long-term soil fertility and minimize nitrate losses (De Notaris et al., 2018). Minh et al. (2018) demonstrated the effectiveness of Sentinel-1 $\sigma^0_{VV}$ and $\sigma^0_{VH}$ in detecting winter cover crops using LSTM and GRU, which outperformed the RF method. Additionally, Najem et al. (2023) highlighted the superiority of multi-level decision trees over RF in both cover crop mapping and the analysis of temporal patterns. While these studies provide valuable insights, further research is needed to fully explore the potential of SAR and DL technologies in understanding the complex dynamics of cover cropping, ultimately leading to improved soil health, reduced environmental impact, and increased agricultural sustainability.

d. Grassland Mowing

The accurate detection of grassland mowing events has significant ecological and economic implications, given the multifaceted roles of grasslands (Reinermann et al., 2022). Beyond serving as a primary source of fodder for livestock (Holtgrave et al., 2023), grasslands deliver a range of ecosystem services, including carbon sequestration (Soussana et al., 2004), water filtration (Jankowska-Huflejt, 2006), and the provision of habitats for many species. A major challenge in monitoring grassland mowing is the rapid regrowth of grasses, which necessitates high-resolution satellite time-series for precise event identification.
In this context, SAR is a valuable tool that complements optical time-series by filling observational gaps caused by cloud cover. Combining these data with meteorological data can markedly improve the precision of mowing event detection, especially given the close association between mowing patterns and specific weather conditions. Several studies have explored SAR and its fusion with optical data for grassland mowing detection (De Vroey et al., 2021; Holtgrave et al., 2023; Reinermann et al., 2022; Schuster et al., 2011; Tamm et al., 2016; Voormansik et al., 2015). Komisarenko et al. (2022) demonstrated the efficacy of CNN and LSTM models with an innovative reject region mechanism for the reliable detection of mowing events throughout the growing season, using a blend of optical, InSAR, and PolSAR satellite time-series data. The study highlighted the superior performance of CNN and LSTM models over traditional shallow ML methods and emphasized the critical role of weather data, particularly precipitation, in the successful detection of mowing events. Despite these advancements, there is a notable gap in research that exploits DL in conjunction with SAR data for grassland mowing event detection.

1.5.4. Training Data

The development of DL models for agricultural applications relies heavily on the acquisition of comprehensive and diverse training datasets. To effectively estimate crop phenology, BPs, and yield, and to detect agricultural management practices, these datasets must include a wide range of measurements, such as VWC, LAI, canopy surface water, yield metrics, and soil characteristics (e.g., moisture and roughness). However, compiling such datasets is a time-consuming and costly endeavor, presenting significant challenges to researchers and practitioners.
To overcome these challenges, several methodologies have been proposed, including data augmentation, transfer learning, and self-supervised, unsupervised, or weakly supervised learning techniques. While these approaches offer potential solutions, reliance on pre-trained models that are not well suited to specific agricultural tasks can hamper model performance and limit advancements in the field. Addressing this issue requires the creation of publicly accessible, extensive, and diverse reference datasets that encompass a variety of agricultural scenarios. Such datasets would greatly facilitate research and development efforts in the application of DL models to agricultural problems.

Some European countries have already taken steps towards creating these datasets by requiring farmers to report their cultivar types as a condition for receiving financial support (Arias et al., 2020). This has led to the development of datasets like EuroCrops (Schneider et al., 2021), ZueriCrop (Turkoglu et al., 2021), BreizhCrops (Rußwurm et al., 2019), and others, which contain hundreds of thousands of labeled parcels and are invaluable for training high-quality ML and DL models. However, assembling a comprehensive, large-scale ground truth database requires involvement from higher public authorities to set guidelines on data collection, storage, usage, and access rights. Establishing these datasets not only aids in developing more precise and efficient models but also supports their evaluation and enhancement. Moreover, access to standardized, community-driven datasets would encourage advancements and innovation in the use of DL with SAR-based agricultural applications. Emphasizing this need, we encourage research institutions, academia, and industry stakeholders to collaborate and contribute towards the creation of these reference datasets.
Table 1.2 provides a comprehensive overview of the open-access ground reference datasets for crop classification. Additionally, Table 1.3 outlines the field campaign datasets for vegetation sampling, which can be instrumental in compiling training datasets for crop monitoring applications.

Table 1.2: Open-access ground reference datasets for crop classification/mapping.

| Product name | Spatial coverage | Time period | Crop type | Research article |
|---|---|---|---|---|
| EuroCrops (combination of all publicly available self-declared crop reporting datasets) | European countries: Austria, Belgium, Germany, Denmark, Estonia, Spain, France, Croatia, Lithuania, Latvia, Netherlands, Romania, Sweden, Slovenia, Slovakia, Portugal | 2015-2022 | Meadow, Vineyards, Winter Barley, Barley, Potato, Winter Rye, Summer Barley, Fallow, Vegetables, Summer Oats, Sunflower, Soya, Millet, Winter Durum, Hops, Berries, Rapeseed, Fodder Roots, Oil seed crops, Maize, Wheat, Sorghum | (Schneider et al., 2023) |
| TimeSen2Crop | Austria | 2017-2019 | 16 crop types: Barley, Wheat, Rapeseed, Corn, Sunflower, Orchards, Nuts, Permanent meadows, Temporary meadows, Grassland, Spring cereals, Legumes, Permanent plantations (includes Vineyards, Cherry Plantation, Apricots, Nectarines, Peach, Apples, Pears, and Plums) | (Weikmann et al., 2021) |
| Annual Crop Inventory | Canada | 2010-now | More than 16 crop types: Wheat, Barley, Canola, Corn, Soybeans, Oats, Peas, Lentils, Flaxseed, Rye, Potatoes, Beans, Mustard, Sunflowers, Fallow, and Pasture | - |
| CAWA Crop type dataset | Uzbekistan and Tajikistan | 2008, 2011, 2015, 2018 | 40 crop types, dominated by "cotton" (40%) and "wheat" (25%). Other crops: rice, maize, orchards, vineyards, alfalfa, potatoes, and onions | (Remelgado et al., 2020) |
| Mali Crop Type Training Data-ground | Mali | 2019 | Maize, Millet, Rice, and Sorghum | (Nakalembe, 2021) |
| Great African Food Company Crop Type | Tanzania | 2018 | Field boundaries and crop types | (Great African Food Company, 2019) |
| Eyes on the Ground Image Data | Kenya | 2019 | Georeferenced crop images along with labels on input use, crop management, phenology, crop damage, and yields | (Waithaka, 2022) |
| Drone Imagery Classification Training Dataset for Crop Types | Rwanda | 2018-2019 | 19 different land cover types, reduced to three crop types (Banana, Maize, and Legume) and two additional non-crop land cover types (Forest and Structure) | (Chew et al., 2020; Rineer et al., 2021) |
| DENETHOR Dataset | Northern Germany | 2018-2019 | Wheat, Rye, Barley, Oats, Corn, Oil Seeds, Root Crops, Meadows, Forage Crops | (Kondmann et al., 2021) |
| ZueriCrop | An area of 50 km × 48 km in the Swiss cantons of Zurich and Thurgau | 2019 | 116000 individual fields spanning 48 crop classes, and 28000 (multi-temporal) image patches from Sentinel-2 | (Turkoglu et al., 2021) |
| BreizhCrops | Brittany region of France | 2017 | Barley, Wheat, Corn, Fodder, Fallow, Miscellaneous, Orchards, Cereals, Permanent Meadows, Protein Crops, Rapeseed, Temporary Meadows, Vegetables | (Rußwurm et al., 2019) |
| CV4A Kenya Crop Type Competition | Kenya | 2019 | Maize, Cassava, Soybean | (Kerner et al., 2020) |
| AgriFieldNet Competition Dataset | Uttar Pradesh, Rajasthan, Odisha, and Bihar in northern India | - | 13 classes, including Fallow land and 12 crop types: Wheat, Mustard, Lentil, Green pea, Sugarcane, Garlic, Maize, Gram, Coriander, Potato, Berseem, and Rice | (Radiant Earth Foundation & IDinsight, 2022) |
| A Fusion Dataset for Crop Type Classification in Germany | Germany and South Africa | 2017 (South Africa), 2018-2019 (Germany) | Nine crop types: Wheat, Rye, Barley, Oats, Corn, Oil Seeds, Root Crops, Meadows, Forage Crops | (Team, 2022) |
| World Cereal Project | 107 in situ datasets around the world: USA, Canada, Brazil, Sri Lanka, Northern India, Central Asia, Sweden, Estonia, Latvia, Lucas, Europe, Lebanon, Egypt, Senegal, Niger, Mali, Burkina Faso, Brazil, Ethiopia, Rwanda, Sudan, Africa, Nigeria, Cameroon, Tanzania, Mozambique, Kenya, Zimbabwe, South Africa, Madagascar | 2017-2021 | Maize and Cereal, including wheat, barley, and rye | (Van Tricht et al., 2023) |
| Cropland ground reference, USGS | India, USA, Indonesia, Vietnam, Thailand | 2016-2017 | Rice, Maize, Barley, Alfalfa, Fallow, Sugarcane, Cassava, Soybean, Palm, Cotton | - |
| Cropland Data Layer (CDL)-USDA-NASS | US | Annual, 1997-now | 106 unique crop types | (Boryan et al., 2011) |
| Campo Verde | Campo Verde municipality, Mato Grosso state, Brazil | 2015-2016 | 14 land use classes: soybean, maize, cotton, beans, sorghum, NCC-millet, NCC-crotalaria, NCC-brachiaria, NCC-grasses, pasture, turf grass, eucalyptus, Cerrado | (Sanches et al., 2018) |

Table 1.3: Field campaign vegetation sampling datasets.
| Name | Time period | Spatial coverage | SAR freq. | Crop type | Measurements | Reference |
|---|---|---|---|---|---|---|
| Eagle campaign | 8-18 June 2006 | Three sites in the Netherlands | L, C, X | One grassland and two forest areas | Land cover type | (Su et al., 2009) |
| AgriSAR Campaign | 16 flights in 2006 | Northeastern Germany (DEMMIN test site) | X, C, L | Winter wheat, winter rape, winter barley, maize, and sugar beet | Crop type, SM, and in situ measurements of BPs | (Skriver et al., 2011) |
| SMEX02 | 2002 (fly on 5-8 days) | Walnut Creek watershed area in Iowa (N: 42.389, S: 41.308, E: -93.017, W: -93.913), Southern Great Plains (SGP) site | C, S, L, P | Soybean and corn, Walnut Creek | Vegetation and land cover (plant height, ground cover, stand density, phenology, LAI, green and dry biomass); soil moisture, surface temperature, surface roughness | (Jackson et al., 2004; Narayan et al., 2004) |
| SMEX03 | July 2003 (fly on 6 days) | Southern and northern parts of Oklahoma around Stillwater and Chickasha (N: 37.02, S: 34.37, E: -97.43, W: -98.39) | C, L, and P | Soybean, alfalfa, and corn | Crop height, density, number of leaves, LAI, VWC, and soil moisture | (Jackson et al., 2007) |
| SMAPVEX08 | Fall 2008 (fly for 7 days every 1-3 days) | Maryland and Delaware (N: 39.09, S: 38.93, E: -75.55, W: -76.25) | L | Soybean | VWC, LAI, crop type, and soil moisture | (Park et al., 2011; NASA, 2008) |
| SMAPVEX12 | June to July 2012 (6 days) | Manitoba (N: 50.01, S: 49.32, E: -97.62, W: -98.67) | L | 55 Ag-land fields, 5 forested sites; corn and soybean; land cover: cereals (32%), canola (13%), corn (7%), soybean (7%), grassland & pasture (16%) | Crop height, stem diameter, number of leaves, VWC, soil moisture, surface roughness | (Fang and Lakshmi, 2014) |
| SMAPVEX15 | August 2015 (every 2-3 days) | Arizona (N: 31.87, S: 31.51, E: -109.84, W: -110.96), Walnut Gulch Experimental Watershed | L | Walnut | Soil moisture, precipitation, vegetation and roughness sampling | (Colliander et al., 2017) |
| SMAPVEX16 | Summer 2016 | Iowa (N: 42.66, S: 42.28, E: -93.21, W: -93.58) and Manitoba (N: 49.79, S: 49.36, E: -97.75, W: -98.12) | L | Corn and soybean | Crop density, height, and biomass; soil moisture and soil temperature | (NASA, 2016 a,b) |

1.6. Conclusion

This comprehensive review has highlighted the transformative impact of SAR with DL on different aspects of agricultural applications. The Sentinel-1 satellite has been the most widely used SAR sensor in agriculture due to its open-access data with continuous temporal coverage. The combination of VH and VV backscatter, along with polarimetric parameters and SAR indices, has significantly enhanced the accuracy of crop classification and monitoring. However, feature selection remains crucial to prevent data redundancy and overfitting. The review revealed that L-band SAR has not been widely used for monitoring and yield estimation due to the lack of freely accessible L-band data. However, the upcoming launches of the NISAR and ROSE-L satellites are expected to bridge this gap by providing L-band SAR data with high temporal and spatial resolution.

End-season crop classification has been extensively covered, and numerous emerging DL methods such as ViT have been developed, leading to improved performance in this application. However, the scarcity of labeled data has hindered the application of DL in crop monitoring and yield prediction. Techniques such as transfer learning and self-supervised learning using foundation models can potentially address this issue by enabling the use of smaller datasets. Moreover, future research should explore the potential of these techniques in early-season crop classification, CPIS, and soil salinity detection, which have received less attention than end-season crop classification. Despite the challenges posed by the limited availability of reference data for training and validation, the integration of SAR with DL continues to revolutionize agricultural applications.
Emerging applications, such as mapping crop residue, tillage, and cover crop, as well as detecting grassland mowing and estimating planting and harvest dates, highlight the tremendous potential of SAR with DL in agriculture. However, to fully harness this potential, the availability of comprehensive training datasets remains a critical bottleneck. Therefore, a concerted effort from the research community is needed to gather and share high-quality, annotated datasets that can support the development of robust DL models for agricultural applications.

REFERENCES

Abbaszadeh, P., Gavahi, K., Alipour, A., Deb, P., Moradkhani, H., 2022. Bayesian multi-modeling of deep neural nets for probabilistic crop yield prediction. Agric For Meteorol 314, 108773. https://doi.org/10.1016/j.agrformet.2021.108773.
Adrian, J., Sagan, V., Maimaitijiang, M., 2021. Sentinel SAR-optical fusion for crop type mapping using deep learning and Google Earth Engine. ISPRS Journal of Photogrammetry and Remote Sensing 175, 215–235. https://doi.org/10.1016/j.isprsjprs.2021.02.018.
Alaska Satellite Facility (ASF), 2023. Copernicus Sentinel data. https://search.asf.alaska.edu/#/ (accessed 4.6.24).
Alemohammad, S.H., Konings, A.G., Jagdhuber, T., Moghaddam, M., Entekhabi, D., 2018. Characterization of vegetation and soil scattering mechanisms across different biomes using P-band SAR polarimetry. Remote Sens Environ 209, 107–117. https://doi.org/10.1016/j.rse.2018.02.032.
Arias, M., Campo-Bescós, M.Á., Álvarez-Mozos, J., 2020. Crop classification based on temporal signatures of Sentinel-1 observations over Navarre province, Spain. Remote Sens 12, 278. https://doi.org/10.3390/rs12020278.
Asadi, B., Shamsoddini, A., 2024. Crop mapping through a hybrid machine learning and deep learning method. Remote Sens Appl 33, 101090. https://doi.org/10.1016/j.rsase.2023.101090.
Badrinarayanan, V., Kendall, A., Cipolla, R., 2017.
Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39, 2481–2495. https://doi.org/10.1109/TPAMI.2016.2644615.
Bahdanau, D., Cho, K., Bengio, Y., 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. https://doi.org/10.48550/arXiv.1409.0473.
Balenzano, A., Mattia, F., Satalino, G., Davidson, M.W.J., 2010. Dense temporal series of C- and L-band SAR data for soil moisture retrieval over agricultural crops. IEEE J Sel Top Appl Earth Obs Remote Sens 4, 439–450. http://dx.doi.org/10.1109/JSTARS.2010.2052916.
Bamler, R., Hartl, P., 1998. Synthetic aperture radar interferometry. Inverse Probl 14, R1. http://dx.doi.org/10.1088/0266-5611/14/4/001.
Barbouchi, M., Abdelfattah, R., Chokmani, K., Aissa, N. Ben, Lhissou, R., El Harti, A., 2014. Soil salinity characterization using polarimetric InSAR coherence: Case studies in Tunisia and Morocco. IEEE J Sel Top Appl Earth Obs Remote Sens 8, 3823–3832. https://doi.org/10.1109/JSTARS.2014.2333535.
Barriguinha, A., Jardim, B., de Castro Neto, M., Gil, A., 2022. Using NDVI, climate data and machine learning to estimate yield in the Douro wine region. International Journal of Applied Earth Observation and Geoinformation 114, 103069. https://doi.org/10.1016/j.jag.2022.103069.
Battude, M., Al Bitar, A., Morin, D., Cros, J., Huc, M., Sicre, C.M., Le Dantec, V., Demarez, V., 2016. Estimating maize biomass and yield over large areas using high spatial and temporal resolution Sentinel-2 like remote sensing data. Remote Sens Environ 184, 668–681. https://doi.org/10.1016/j.rse.2016.07.030.
Blaes, X., Defourny, P., Wegmuller, U., Della Vecchia, A., Guerriero, L., Ferrazzoli, P., 2006. C-band polarimetric indexes for maize monitoring based on a validated radiative transfer model. IEEE Transactions on Geoscience and Remote Sensing 44, 791–800. https://doi.org/10.1109/TGRS.2005.860969.
Boryan, C., Yang, Z., Mueller, R., Craig, M., 2011. Monitoring US agriculture: the US department of agriculture, national agricultural statistics service, cropland data layer program. Geocarto Int 26, 341–358. https://doi.org/10.1080/10106049.2011.562309.
Bouman, B.A.M., Hoekman, D.H., 1993. Multi-temporal, multi-frequency radar measurements of agricultural crops during the Agriscatt-88 campaign in The Netherlands. International Journal of Remote Sensing 14, 1595–1614. https://doi.org/10.1080/01431169308953988.
Bountos, N.I., Ouaknine, A., Rolnick, D., 2023. FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models. arXiv preprint arXiv:2312.10114. https://doi.org/10.48550/arXiv.2312.10114.
Brown, C.F., Brumby, S.P., Guzder-Williams, B., Birch, T., Hyde, S.B., Mazzariello, J., Czerwinski, W., Pasquarella, V.J., Haertel, R., Ilyushchenko, S., 2022. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci Data 9, 251. https://doi.org/10.1038/s41597-022-01307-4.
Busquier, M., Lopez-Sanchez, J.M., Mestre-Quereda, A., Navarro, E., González-Dugo, M.P., Mateos, L., 2020. Exploring TanDEM-X interferometric products for crop-type mapping. Remote Sensing 12, 1774. https://doi.org/10.3390/rs12111774.
Busquier, M., Lopez-Sanchez, J.M., Ticconi, F., Floury, N., 2022. Combination of Time Series of L-, C-, and X-Band SAR Images for Land Cover and Crop Classification. IEEE J Sel Top Appl Earth Obs Remote Sens 15, 8266–8286. https://doi.org/10.1109/JSTARS.2022.3207574.
Cai, W., Zhao, S., Wang, Y., Peng, F., Heo, J., Duan, Z., 2019. Estimation of winter wheat residue coverage using optical and SAR remote sensing images. Remote Sens 11, 1163. https://doi.org/10.3390/rs11101163.
Cameron, W.L., Rais, H., 2006. Conservative polarimetric scatterers and their role in incorrect extensions of the Cameron decomposition.
IEEE Transactions on Geoscience and Remote Sensing 44, 3506–3516. https://doi.org/10.1109/TGRS.2006.879115.
Canisius, F., Shang, J., Liu, J., Huang, X., Ma, B., Jiao, X., Geng, X., Kovacs, J.M., Walters, D., 2018. Tracking crop phenological development using multi-temporal polarimetric Radarsat-2 data. Remote Sens Environ 210, 508–518. https://doi.org/10.1016/j.rse.2017.07.031.
Cao, J., Zhang, Z., Tao, F., Zhang, L., Luo, Y., Zhang, J., Han, J., Xie, J., 2021. Integrating multi-source data for rice yield prediction across China using machine learning and deep learning approaches. Agric For Meteorol 297, 108275. https://doi.org/10.1016/j.agrformet.2020.108275.
Carranza-García, M., García-Gutiérrez, J., Riquelme, J.C., 2019. A framework for evaluating land use and land cover classification using convolutional neural networks. Remote Sens 11, 274. https://doi.org/10.3390/rs11030274.
Castro, J.B., Feitosa, R.Q., Happ, P.N., 2018. An hybrid recurrent convolutional neural network for crop type recognition based on multitemporal SAR image sequences, in: IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. IEEE, pp. 3824–3827. https://doi.org/10.1109/IGARSS.2018.8517280.
Chandrasekaran, S., Ramanathan, S., Basak, T., 2012. Microwave material processing—a review. AIChE Journal 58, 330–363. https://doi.org/10.1002/aic.12766.
Chang, J.G., Shoshany, M., Oh, Y., 2018. Polarimetric radar vegetation index for biomass estimation in desert fringe ecosystems. IEEE Transactions on Geoscience and Remote Sensing 56, 7102–7108. https://doi.org/10.1109/TGRS.2018.2848285.
Chang, Y.-L., Tan, T.-H., Chen, T.-H., Chuah, J.H., Chang, L., Wu, M.-C., Tatini, N.B., Ma, S.-C., Alkhaleefah, M., 2022. Spatial-temporal neural network for rice field classification from SAR images. Remote Sens 14, 1929. https://doi.org/10.3390/rs14081929.
Chaurasia, A., Culurciello, E., 2017.
Linknet: Exploiting encoder representations for efficient semantic segmentation, in: 2017 IEEE Visual Communications and Image Processing (VCIP). IEEE, pp. 1–4. https://doi.org/10.1109/VCIP.2017.8305148.
Chen, J.-L., Kuo, C.-C., Chen, L.-G., 2014. Region-of-unpredictable determination for accelerated full-frame feature generation in video sequences, in: 2014 IEEE Visual Communications and Image Processing Conference. pp. 434–437. https://doi.org/10.1109/VCIP.2014.7051599.
Chen, L.-C., Papandreou, G., Schroff, F., Adam, H., 2017. Rethinking atrous convolution for semantic image segmentation. https://doi.org/10.48550/arXiv.1706.05587.
Chen, S.-W., Tao, C.-S., 2018. PolSAR image classification using polarimetric-feature-driven deep convolutional neural network. IEEE Geoscience and Remote Sensing Letters 15, 627–631. https://doi.org/10.1109/LGRS.2018.2799877.
Chen, T., Guestrin, C., 2016. Xgboost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 785–794. https://doi.org/10.1145/2939672.2939785.
Chew, R., Rineer, J., Beach, R., O’Neil, M., Ujeneza, N., Lapidus, D., Miano, T., Hegarty-Craver, M., Polly, J., Temple, D.S., 2020. Deep neural networks and transfer learning for food crop identification in UAV images. Drones 4, 7. https://doi.org/10.3390/drones4010007.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. https://doi.org/10.48550/arXiv.1406.1078.
Chollet, F., 2017. Xception: Deep learning with depthwise separable convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1251–1258. https://doi.org/10.48550/arXiv.1610.02357.
Christiansen, P., Nielsen, L.N., Steen, K.A., Jørgensen, R.N., Karstoft, H., 2016.
DeepAnomaly: Combining background subtraction and deep learning for detecting obstacles and anomalies in an agricultural field. Sensors 16, 1904. https://doi.org/10.3390/s16111904.
Cloude, S.R., Pottier, E., 1996. A review of target decomposition theorems in radar polarimetry. IEEE Transactions on Geoscience and Remote Sensing 34, 498–518. https://doi.org/10.1109/36.485127.
Colliander, A., Cosh, M.H., Misra, S., Jackson, T.J., Crow, W.T., Chan, S., Bindlish, R., Chae, C., Collins, C.H., Yueh, S.H., 2017. Validation and scaling of soil moisture in a semi-arid environment: SMAP validation experiment 2015 (SMAPVEX15). Remote Sens Environ 196, 101–112. https://doi.org/10.1016/j.rse.2017.04.022.
Crisóstomo de Castro Filho, H., Abílio de Carvalho Júnior, O., Ferreira de Carvalho, O.L., Pozzobon de Bem, P., dos Santos de Moura, R., Olino de Albuquerque, A., Rosa Silva, C., Guimaraes Ferreira, P.H., Fontes Guimarães, R., Trancoso Gomes, R.A., 2020. Rice crop detection using LSTM, Bi-LSTM, and machine learning models from Sentinel-1 time series. Remote Sens 12, 2655. https://doi.org/10.3390/rs12162655.
Cué La Rosa, L.E., Happ, P.N., Feitosa, R.Q., 2018. Dense fully convolutional networks for crop recognition from multitemporal SAR image sequences, in: IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. IEEE, pp. 7460–7463. https://doi.org/10.1109/IGARSS.2018.8517995.
Cué La Rosa, L.E., Queiroz Feitosa, R., Nigri Happ, P., Del’Arco Sanches, I., Ostwald Pedro da Costa, G.A., 2019. Combining deep learning and prior knowledge for crop mapping in tropical regions from multitemporal SAR image sequences. Remote Sens 11, 2029. https://doi.org/10.3390/rs11172029.
Cué La Rosa, L.E., Oliveira, D.A.B., Ghamisi, P., 2023. Learning crop type mapping from regional label proportions in large-scale SAR and optical imagery. IEEE Transactions on Geoscience and Remote Sensing. https://doi.org/10.1109/TGRS.2023.3321156.
Dadhwal, V.K., 2003.
Crop growth and productivity monitoring and simulation using remote sensing and GIS. Satellite remote sensing and GIS applications in agricultural meteorology 263–289. https://www.preventionweb.net/files/1682_9970.pdf#page=263
Dalsasso, E., Denis, L., Tupin, F., 2021a. As if by magic: self-supervised training of deep despeckling networks with MERLIN. IEEE Transactions on Geoscience and Remote Sensing 60, 1–13. https://doi.org/10.1109/TGRS.2021.3128621.
Dalsasso, E., Denis, L., Tupin, F., 2021b. SAR2SAR: A semi-supervised despeckling algorithm for SAR images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14, 4321–4329. https://doi.org/10.1109/JSTARS.2021.3071864.
Dalsasso, E., Yang, X., Denis, L., Tupin, F., Yang, W., 2020. SAR image despeckling by deep neural networks: From a pre-trained model to an end-to-end training strategy. Remote Sensing 12, 2636. https://doi.org/10.3390/rs12162636.
de Albuquerque, A.O., de Carvalho Júnior, O.A., Carvalho, O.L.F. de, de Bem, P.P., Ferreira, P.H.G., de Moura, R. dos S., Silva, C.R., Trancoso Gomes, R.A., Fontes Guimarães, R., 2020. Deep semantic segmentation of center pivot irrigation systems from remotely sensed data. Remote Sens 12, 2159. https://doi.org/10.3390/rs12132159.
de Albuquerque, A.O., de Carvalho, O.L.F., e Silva, C.R., de Bem, P.P., Gomes, R.A.T., Borges, D.L., Guimarães, R.F., Pimentel, C.M.M., de Carvalho Júnior, O.A., 2021. Instance segmentation of center pivot irrigation systems using multi-temporal SENTINEL-1 SAR images. Remote Sens Appl 23, 100537. https://doi.org/10.1016/j.rsase.2021.100537.
De Notaris, C., Rasmussen, J., Sørensen, P., Olesen, J.E., 2018. Nitrogen leaching: A crop rotation perspective on the effect of N surplus, field management and use of catch crops. Agric Ecosyst Environ 255, 1–11. https://doi.org/10.1016/j.agee.2017.12.009.
De Vroey, M., Radoux, J., Defourny, P., 2021.
Grassland mowing detection using sentinel-1 time series: potential and limitations. Remote Sens 13, 348. https://doi.org/10.3390/rs13030348.
Desai, G., Gaikwad, A., 2021. Deep Learning Techniques for Crop Classification Applied to SAR Imagery: A Survey, in: 2021 Asian Conference on Innovation in Technology (ASIANCON). IEEE, pp. 1–6. https://doi.org/10.1109/ASIANCON51346.2021.9544707.
Deschamps, B., McNairn, H., Shang, J., Jiao, X., 2012. Towards operational radar-only crop type classification: comparison of a traditional decision tree with a random forest classifier. Canadian Journal of Remote Sensing 38, 60–68. https://doi.org/10.5589/m12-012.
Di Martino, T., Guinvarc’h, R., Thirion-Lefevre, L., & Colin, E. (2022). FARMSAR: Fixing AgRicultural Mislabels Using Sentinel-1 Time Series and AutoencodeRs. Remote Sensing, 15(1), 35. http://dx.doi.org/10.3390/rs15010035.
Di Martino, T., Guinvarc’h, R., Thirion-Lefevre, L., & Koeniguer, E. C. (2021). Beets or cotton? blind extraction of fine agricultural classes using a convolutional autoencoder applied to temporal sar signatures. IEEE Transactions on Geoscience and Remote Sensing, 60, 1–18. https://doi.org/10.1109/TGRS.2021.3100637.
Dobson, M.C., Ulaby, F.T., Hallikainen, M.T., El-Rayes, M.A., 1985. Microwave dielectric behavior of wet soil-Part II: Dielectric mixing models. IEEE Transactions on geoscience and remote sensing 35–46. https://doi.org/10.1109/TGRS.1985.289498.
Dong, J., Xiao, X., Menarguez, M.A., Zhang, G., Qin, Y., Thau, D., Biradar, C., Moore III, B., 2016. Mapping paddy rice planting area in northeastern Asia with Landsat 8 images, phenology-based algorithm and Google Earth Engine. Remote Sens Environ 185, 142–154. https://doi.org/10.1016/j.rse.2016.02.016.
Drusch, M., Del Bello, U., Carlier, S., Colin, O., Fernandez, V., Gascon, F., Hoersch, B., Isola, C., Laberinti, P., Martimort, P., 2012. Sentinel-2: ESA’s optical high-resolution mission for GMES operational services.
Remote Sens Environ 120, 25–36. https://doi.org/10.1016/j.rse.2011.11.026.
Dupuis, A., Dadouchi, C., Agard, B., 2023. Methodology for multi-temporal prediction of crop rotations using recurrent neural networks. Smart Agricultural Technology 4, 100152. https://doi.org/10.1016/j.atech.2022.100152.
Erten, E., Lopez-Sanchez, J. M., Yuzugullu, O., & Hajnsek, I. (2016). Retrieval of agricultural crop height from space: A comparison of SAR techniques. Remote Sensing of Environment, 187, 130–144. https://doi.org/10.1016/j.rse.2016.10.007
Fang, B., Lakshmi, V., 2014. Passive/active microwave soil moisture retrieval disaggregation using SMAPVEX12 data, in: Land Surface Remote Sensing II. SPIE, pp. 83–92. https://doi.org/10.1117/12.2064441.
Feng, X., Zhou, J., Li, M., Wang, X., Long, J., 2022. The use of ALSTM-FCN for tobacco planting extraction from time-series Sentinel-1A SAR images, in: 2022 29th International Conference on Geoinformatics. IEEE, pp. 1–5. https://doi.org/10.1109/Geoinformatics57846.2022.9963795
Fernandez-Beltran, R., Baidar, T., Kang, J., Pla, F., 2021. Rice-yield prediction with multi-temporal Sentinel-2 data and 3D CNN: A case study in Nepal. Remote Sens 13, 1391. https://doi.org/10.3390/rs13071391.
Ferrazzoli, P., Paloscia, S., Pampaloni, P., Schiavon, G., Sigismondi, S., Solimini, D., 1997. The potential of multifrequency polarimetric SAR in assessing agricultural and arboreous biomass. IEEE Transactions on Geoscience and Remote Sensing 35, 5–17. https://doi.org/10.1109/36.551929.
Fieuzal, R., Sicre, C.M., Baup, F., 2017. Estimation of corn yield using multi-temporal optical and radar satellite data and artificial neural networks. International journal of applied earth observation and geoinformation 57, 14–23. https://doi.org/10.1016/j.jag.2016.12.011.
Fontanelli, G., Lapini, A., Santurri, L., Pettinato, S., Santi, E., Ramat, G., Pilia, S., Baroni, F., Tapete, D., Cigna, F., 2022.
Early-Season Crop Mapping on an Agricultural Area in Italy Using X-Band Dual-Polarization SAR Satellite Data and Convolutional Neural Networks. IEEE J Sel Top Appl Earth Obs Remote Sens 15, 6789–6803. https://doi.org/10.1109/JSTARS.2022.3198475.
Freeman, A., Durden, S.L., 1998. A three-component scattering model for polarimetric SAR data. IEEE transactions on geoscience and remote sensing 36, 963–973. https://doi.org/10.1109/36.673687.
Freudenberg, M., Nölke, N., Agostini, A., Urban, K., Wörgötter, F., Kleinn, C., 2019. Large scale palm tree detection in high resolution satellite images using U-Net. Remote Sens 11, 312. https://doi.org/10.3390/rs11030312
Fuller, A., Millard, K., Green, J.R., 2023. CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders. https://doi.org/10.48550/arXiv.2311.00566.
Gao, F., Anderson, M.C., Zhang, X., Yang, Z., Alfieri, J.G., Kustas, W.P., Mueller, R., Johnson, D.M., Prueger, J.H., 2017. Toward mapping crop progress at field scales through fusion of Landsat and MODIS imagery. Remote Sens Environ 188, 9–25. https://doi.org/10.1016/j.rse.2016.11.004.
Gargiulo, M., Dell’Aglio, D.A.G., Iodice, A., Riccio, D., Ruello, G., 2020. Integration of sentinel-1 and sentinel-2 data for land cover mapping using w-net. Sensors 20, 2969. https://doi.org/10.3390/s20102969.
Garioud, A., Valero, S., Giordano, S., Mallet, C., 2021. Recurrent-based regression of Sentinel time series for continuous vegetation monitoring. Remote Sens Environ 263, 112419. https://doi.org/10.1016/j.rse.2021.112419.
Garnot, V.S.F., Landrieu, L., 2021. Panoptic segmentation of satellite image time series with convolutional temporal attention networks, in: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 4872–4881. https://doi.org/10.48550/arXiv.2107.07933.
Garnot, V.S.F., Landrieu, L., Chehata, N., 2022. Multi-modal temporal attention models for crop mapping from satellite time series.
ISPRS Journal of Photogrammetry and Remote Sensing 187, 294–305. https://doi.org/10.1016/j.isprsjprs.2022.03.012.
Garnot, V.S.F., Landrieu, L., Giordano, S., Chehata, N., 2020. Satellite image time series classification with pixel-set encoders and temporal self-attention, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 12325–12334. https://doi.org/10.48550/arXiv.1911.07757.
Gavahi, K., Abbaszadeh, P., Moradkhani, H., 2021. DeepYield: A combined convolutional neural network with long short-term memory for crop yield forecasting. Expert Syst Appl 184, 115511. https://doi.org/10.1016/j.eswa.2021.115511.
Ge, J., Zhang, H., Xu, L., Sun, C., Duan, H., Guo, Z., Wang, C., 2023. A Physically Interpretable Rice Field Extraction Model for PolSAR Imagery. Remote Sens 15, 974. https://doi.org/10.3390/rs15040974.
Giordano, S., Bailly, S., Landrieu, L., Chehata, N., 2020. Improved crop classification with rotation knowledge using sentinel-1 and-2 time series. Photogramm Eng Remote Sensing 86, 431–441. https://doi.org/10.14358/PERS.86.7.431.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep learning. The MIT Press. http://www.deeplearningbook.org
Graves, A., 2012. Long short-term memory. Supervised sequence labelling with recurrent neural networks, pp. 37–45. http://dx.doi.org/10.1007/978-3-642-24797-2.
Radiant Earth., 2023. Great African Food Company Tanzania. https://beta.source.coop/radiantearth/african-crops-tanzania-01/ [dataset]
Gu, L., He, F., Yang, S., 2019. Crop classification based on deep learning in northeast China using SAR and optical imagery, in: 2019 SAR in Big Data Era (BIGSARDATA). IEEE, pp. 1–4. https://doi.org/10.1109/BIGSARDATA.2019.8858437.
Guo, J., Li, H., Ning, J., Han, W., Zhang, W., Zhou, Z.-S., 2020. Feature dimension reduction using stacked sparse auto-encoders for crop classification with multi-temporal, quad-pol SAR Data. Remote Sens 12, 321. https://doi.org/10.3390/rs12020321.
Guo, Z., Qi, W., Huang, Y., Zhao, J., Yang, H., Koo, V.-C., Li, N., 2022. Identification of crop type based on C-AENN using time series Sentinel-1A SAR data. Remote Sens 14, 1379. https://doi.org/10.3390/rs14061379.
Hajnsek, I., Desnos, Y.-L., 2021. Polarimetric synthetic aperture radar: principles and application. Springer Nature. https://doi.org/10.1007/978-3-030-56504-6.
Han, D., Wang, P., Tansey, K., Liu, J., Zhang, Y., Zhang, S., Li, H., 2022. Combining Sentinel-1 and-3 Imagery for Retrievals of Regional Multitemporal Biophysical Parameters Under a Deep Learning Framework. IEEE J Sel Top Appl Earth Obs Remote Sens 15, 6985–6998. https://doi.org/10.1109/JSTARS.2022.3200735.
Han, Z., Zhang, C., Gao, L., Zeng, Z., Zhang, B., Atkinson, P.M., 2023. Spatio-temporal multi-level attention crop mapping method using time-series SAR imagery. ISPRS Journal of Photogrammetry and Remote Sensing 206, 293–310. https://doi.org/10.1016/j.isprsjprs.2023.11.016.
Hao, P., Di, L., Zhang, C., Guo, L., 2020. Transfer Learning for Crop classification with Cropland Data Layer data (CDL) as training samples. Science of The Total Environment 733, 138869. https://doi.org/10.1016/j.scitotenv.2020.138869.
Haralick, R.M., Shanmugam, K., Dinstein, I.H., 1973. Textural features for image classification. IEEE Trans Syst Man Cybern 610–621. https://doi.org/10.1109/TSMC.1973.4309314.
Hashemi, M.G.Z., Abhishek, A., Jalilvand, E., Jayasinghe, S., Andreadis, K.M., Siqueira, P., Das, N.N., 2022. Assessing the impact of Sentinel-1 derived planting dates on rice crop yield modeling. International Journal of Applied Earth Observation and Geoinformation 114, 103047. https://doi.org/10.1016/j.jag.2022.103047.
Hashemi, M.G.Z., Tan, P., Jalilvand, E., Wilke, B., Alemohammad, H., Das, N.N., in press. Leveraging Deep Learning for High-Resolution Crop Yield Estimation with Sentinel-1 SAR Data. Comput Electron Agric.
Heupel, K., Spengler, D., Itzerott, S., 2018.
A Progressive Crop-Type Classification Using Multitemporal Remote Sensing Data and Phenological Information. PFG – Journal of Photogrammetry, Remote Sensing and Geoinformation Science 86, 53–69. https://doi.org/10.1007/s41064-018-0050-7.
Hinton, G.E., Salakhutdinov, R.R., 2006. Reducing the dimensionality of data with neural networks. Science 313, 504–507. https://doi.org/10.1126/science.1127647
Hoa, P.V., Giang, N.V., Binh, N.A., Hai, L.V.H., Pham, T.-D., Hasanlou, M., Tien Bui, D., 2019. Soil salinity mapping using SAR Sentinel-1 data and advanced machine learning algorithms: a case study at Ben Tre Province of the Mekong River Delta (Vietnam). Remote Sens 11, 128. https://doi.org/10.3390/rs11020128.
Holtgrave, A.-K., Lobert, F., Erasmi, S., Röder, N., Kleinschmit, B., 2023. Grassland mowing event detection using combined optical, SAR, and weather time series. Remote Sens Environ 295, 113680. https://doi.org/10.1016/j.rse.2023.113680.
Hosseini, M., McNairn, H., Mitchell, S., Dingle Robertson, L., Davidson, A., Homayouni, S., 2019. Synthetic aperture radar and optical satellite data for estimating the biomass of corn. International Journal of Applied Earth Observation and Geoinformation 83, 101933. https://doi.org/10.1016/j.jag.2019.101933.
Hu, Y., Zeng, H., Tian, F., Zhang, M., Wu, B., Gilliams, S., Li, S., Li, Y., Lu, Y., Yang, H., 2022. An interannual transfer learning approach for crop classification in the Hetao Irrigation district, China. Remote Sens 14, 1208. https://doi.org/10.3390/rs14051208.
Hu, Z., Zhang, J., Ge, Y., 2021. Handling vanishing gradient problem using artificial derivative. IEEE Access 9, 22371–22377. https://doi.org/10.1109/ACCESS.2021.3054915.
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q., 2017. Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4700–4708. https://doi.org/10.48550/arXiv.1608.06993.
Huang, J., Gómez-Dans, J.L., Huang, H., Ma, H., Wu, Q., Lewis, P.E., Liang, S., Chen, Z., Xue, J.-H., Wu, Y., 2019. Assimilation of remote sensing into crop growth models: Current status and perspectives. Agric For Meteorol 276, 107609. https://doi.org/10.1016/j.agrformet.2019.06.008.
Huang, X., Reba, M., Coffin, A., Runkle, B.R.K., Huang, Y., Chapman, B., Ziniti, B., Skakun, S., Kraatz, S., Siqueira, P., Torbick, N., 2021. Cropland mapping with L-band UAVSAR and development of NISAR products. Remote Sens Environ 253, 112180. https://doi.org/10.1016/j.rse.2020.112180.
Ienco, D., Interdonato, R., Gaetano, R., Minh, D.H.T., 2019. Combining Sentinel-1 and Sentinel-2 Satellite Image Time Series for land cover mapping via a multi-source deep learning architecture. ISPRS Journal of Photogrammetry and Remote Sensing 158, 11–22. http://dx.doi.org/10.1016/j.isprsjprs.2019.09.016.
Inoue, Y., Kurosu, T., Maeno, H., Uratsuka, S., Kozu, T., Dabrowska-Zielinska, K., Qi, J., 2002. Season-long daily measurements of multifrequency (Ka, Ku, X, C, and L) and full-polarization backscatter signatures over paddy rice field and their relationship with biological variables. Remote Sens Environ 81, 194–204. https://doi.org/10.1016/S0034-4257(01)00343-1.
Inoue, Y., Sakaiya, E., Wang, C., 2014. Capability of C-band backscattering coefficients from high-resolution satellite SAR sensors to assess biophysical variables in paddy rice. Remote Sens Environ 140, 257–266. https://doi.org/10.1016/j.rse.2013.09.001.
Jackson, T., Bindlish, R., Van der Velde, R., 2004. SMEX02 Airborne Synthetic Aperture Radar (AIRSAR) Data, Iowa. https://nsidc.org/data/nsidc-0206/versions/1. [Dataset].
Jackson, T. and L. McKee. 2007. SMEX03 Vegetation Data: Oklahoma, Version 1. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5067/A1E1EWIHPHAO. [Dataset].
Jankowska-Huflejt, H., 2006.
The function of permanent grasslands in water resources protection. Journal of Water and Land Development 55–65. http://dx.doi.org/10.2478/v10025-007-0005-7
Jentsch, A., Kreyling, J., Boettcher‐Treschkow, J., Beierkuhnlein, C., 2009. Beyond gradual warming: extreme weather events alter flower phenology of European grassland and heath species. Glob Chang Biol 15, 837–849. https://doi.org/10.1111/j.1365-2486.2008.01690.x.
Jiang, J., Zhang, H., Ge, J., Xu, L., Song, M., Sun, C., Wang, C., 2023. Single-Season Rice Area Mapping by Combining Multi-Temporal Polarization Decomposition Components and the Two-Stage Segmentation Method. Agriculture 14, 2. https://doi.org/10.3390/agriculture14010002.
Jin, Y., Liu, X., Chen, Y., Liang, X., 2018. Land-cover mapping using Random Forest classification and incorporating NDVI time-series and texture: A case study of central Shandong. Int J Remote Sens 39, 8703–8723. http://dx.doi.org/10.1080/01431161.2018.1490976
Jo, H.-W., Koukos, A., Sitokonstantinou, V., Lee, W.-K., Kontoes, C., 2022. Towards Global Crop Maps with Transfer Learning. https://doi.org/10.48550/arXiv.2211.04755.
Jo, H.-W., Lee, S., Park, E., Lim, C.-H., Song, C., Lee, H., Ko, Y., Cha, S., Yoon, H., Lee, W.-K., 2020. Deep learning applications on multitemporal SAR (Sentinel-1) image classification using confined labeled data: The case of detecting rice paddy in South Korea. IEEE Transactions on Geoscience and Remote Sensing 58, 7589–7601. https://doi.org/10.1109/TGRS.2020.2981671.
Johnson, J.M., Khoshgoftaar, T.M., 2019. Survey on deep learning with class imbalance. J Big Data 6, 1–54. https://doi.org/10.1186/s40537-019-0192-5.
Judge, J., Liu, P.-W., Monsiváis-Huertero, A., Bongiovanni, T., Chakrabarti, S., Steele-Dunne, S.C., Preston, D., Allen, S., Bermejo, J.P., Rush, P., DeRoo, R., Colliander, A., Cosh, M., 2021. Impact of vegetation water content information on soil moisture retrievals in agricultural regions: An analysis based on the SMAPVEX16-MicroWEX dataset.
Remote Sens Environ 265, 112623. https://doi.org/10.1016/j.rse.2021.112623.
Jung, J., Maeda, M., Chang, A., Bhandari, M., Ashapure, A., Landivar-Bowles, J., 2021. The potential of remote sensing and artificial intelligence as tools to improve the resilience of agriculture production systems. Curr Opin Biotechnol 70, 15–22. https://doi.org/10.1016/j.copbio.2020.09.003.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. A review of the use of convolutional neural networks in agriculture. J Agric Sci 156, 312–322. https://doi.org/10.1017/S0021859618000436.
Kang, B., Li, Y., Xie, S., Yuan, Z., Feng, J., 2020. Exploring balanced feature spaces for representation learning, in: International Conference on Learning Representations. https://openreview.net/pdf?id=OqtLIabPTit.
Katal, N., Rzanny, M., Mäder, P., Wäldchen, J., 2022. Deep learning in plant phenological research: A systematic literature review. Front Plant Sci 13, 805738. http://dx.doi.org/10.3389/fpls.2022.805738
Katharopoulos, A., Vyas, A., Pappas, N. and Fleuret, F., 2020, November. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. In International conference on machine learning (pp. 5156-5165). PMLR. https://doi.org/10.48550/arXiv.2006.16236.
Kattenborn, T., Leitloff, J., Schiefer, F., Hinz, S., 2021. Review on Convolutional Neural Networks (CNN) in vegetation remote sensing. ISPRS journal of photogrammetry and remote sensing 173, 24–49. https://doi.org/10.1016/j.isprsjprs.2020.12.010.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., Liu, T.-Y., 2017. Lightgbm: A highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30. https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf.
Kerner, H., Nakalembe, C., Becker-Reshef, I., 2020. Field-level crop type classification with k nearest neighbors: A baseline for a new Kenya smallholder dataset. https://doi.org/10.48550/arXiv.2004.03023.
Khan, S.H., Hayat, M., Bennamoun, M., Sohel, F.A., Togneri, R., 2017. Cost-sensitive learning of deep feature representations from imbalanced data. IEEE Trans Neural Netw Learn Syst 29, 3573–3587. https://doi.org/10.1109/TNNLS.2017.2732482.
Kim, S.-B., Huang, H., Liao, T.-H., Colliander, A., 2018. Estimating Vegetation Water Content and Soil Surface Roughness Using Physical Models of L-Band Radar Scattering for Soil Moisture Retrieval. Remote Sens 10, 556. https://doi.org/10.3390/rs10040556.
Kim, Y., Jackson, T., Bindlish, R., Lee, H., Hong, S., 2011. Radar vegetation index for estimating the vegetation water content of rice and soybean. IEEE Geoscience and Remote Sensing Letters 9, 564–568. https://doi.org/10.1109/LGRS.2011.2174772.
Komisarenko, V., Voormansik, K., Elshawi, R., Sakr, S., 2022. Exploiting time series of Sentinel-1 and Sentinel-2 to detect grassland mowing events using deep learning with reject region. Sci Rep 12, 983. https://doi.org/10.1038/s41598-022-04932-6.
Kondmann, L., Toker, A., Rußwurm, M., Camero Unzueta, A., Peressuti, D., Milcinski, G., Longépé, N., Mathieu, P.-P., Davis, T., & Marchisio, G. (2021). DENETHOR: The DynamicEarthNET dataset for Harmonized, inter-Operable, analysis-Ready, daily crop monitoring from space. 35th Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 1–13. https://openreview.net/pdf?id=uUa4jNMLjrL.
Kondmann, L., Boeck, S., Bonifacio, R., Zhu, X.X., 2022. Early Crop Type Classification With Satellite Imagery-An Empirical Analysis. https://pml4dc.github.io/iclr2022/pdf/PML4DC_ICLR2022_3.pdf.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25. https://doi.org/10.1145/3065386.
Kussul, N., Lavreniuk, M., Skakun, S., Shelestov, A., 2017. Deep learning classification of land cover and crop types using remote sensing data.
IEEE Geoscience and Remote Sensing Letters 14, 778–782. https://doi.org/10.1109/LGRS.2017.2681128.
Kussul, N., Mykola, L., Shelestov, A., Skakun, S., 2018. Crop inventory at regional scale in Ukraine: developing in season and end of season crop maps with multi-temporal optical and SAR satellite imagery. Eur J Remote Sens 51, 627–636. https://doi.org/10.1080/22797254.2018.1454265.
Lapini, A., Fontanelli, G., Pettinato, S., Santi, E., Paloscia, S., Tapete, D., Cigna, F., 2020. Application of Deep Learning to optical and SAR Images for the Classification of Agricultural Areas in Italy, in: IGARSS 2020-2020 IEEE International Geoscience and Remote Sensing Symposium. IEEE, pp. 4163–4166. https://doi.org/10.1109/IGARSS39084.2020.9323190.
Lee, J.-S., Pottier, E., 2017. Polarimetric radar imaging: from basics to applications. CRC press. https://doi.org/10.1201/9781420054989.
Li, H., Lu, J., Tian, G., Yang, H., Zhao, J., Li, N., 2022. Crop classification based on GDSSM-CNN using multi-temporal RADARSAT-2 SAR with limited labeled data. Remote Sens 14, 3889. https://doi.org/10.3390/rs14163889.
Li, H., Zhang, C., Zhang, S., Ding, X., Atkinson, P.M., 2021. Iterative Deep Learning (IDL) for agricultural landscape classification using fine spatial resolution remotely sensed imagery. International Journal of Applied Earth Observation and Geoinformation 102, 102437. https://doi.org/10.1016/j.jag.2021.102437.
Li, J., Li, C., Xu, W., Feng, H., Zhao, F., Long, H., Meng, Y., Chen, W., Yang, H., Yang, G., 2022. Fusion of optical and SAR images based on deep learning to reconstruct vegetation NDVI time series in cloud-prone regions. International Journal of Applied Earth Observation and Geoinformation 112, 102818. https://doi.org/10.1016/j.jag.2022.102818.
Li, K., Zhao, W., Peng, R., Ye, T., 2022. Multi-branch self-learning Vision Transformer (MSViT) for crop type mapping with Optical-SAR time-series. Comput Electron Agric 203, 107497.
https://doi.org/10.1016/j.compag.2022.107497
Liao, C., Wang, J., Xie, Q., Baz, A. al, Huang, X., Shang, J., He, Y., 2020. Synergistic use of multi-temporal RADARSAT-2 and VENµS data for crop classification based on 1D convolutional neural network. Remote Sens 12, 832. https://doi.org/10.3390/rs12050832.
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S., 2017. Feature pyramid networks for object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2117–2125. https://doi.org/10.1109/CVPR.2017.106.
Lin, Z., Zhong, R., Xiong, X., Guo, C., Xu, Jinfan, Zhu, Y., Xu, Jialu, Ying, Y., Ting, K.C., Huang, J., 2022. Large-scale rice mapping using multi-task spatiotemporal deep learning and Sentinel-1 SAR time series. Remote Sens 14, 699. https://doi.org/10.3390/rs14030699.
Liu, J., Li, M., Wang, X., Feng, X., Zhou, J., Zhang, H., 2023. Early Identification of Tobacco Fields Based on Sentinel-1 SAR Images, in: 2023 11th International Conference on Agro-Geoinformatics. IEEE, pp. 1–5. http://dx.doi.org/10.1109/Agro-Geoinformatics59224.2023.10233334.
Liu, Y., Zhao, W., Chen, S., Ye, T., 2021. Mapping crop rotation by using deeply synergistic optical and SAR time series. Remote Sens 13, 4160. https://doi.org/10.3390/rs13204160.
Lobert, F., Löw, J., Schwieder, M., Gocht, A., Schlund, M., Hostert, P., Erasmi, S., 2023. A deep learning approach for deriving winter wheat phenology from optical and SAR time series at field level. Remote Sens Environ 298, 113800. https://doi.org/10.1016/j.rse.2023.113800.
Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3431–3440. https://doi.org/10.1109/CVPR.2015.7298965.
Lopez-Sanchez, J.M., Ballester-Berman, J.D., 2009. Potentials of polarimetric SAR interferometry for agriculture monitoring. Radio Sci 44, 1–20.
https://doi.org/10.1029/2008RS004078.
Luo, J., Lv, Y., Guo, J., 2022. Multi-temporal PolSAR Image Classification Using F-SAE-CNN, in: 2022 3rd China International SAR Symposium (CISS). IEEE, pp. 1–5. http://dx.doi.org/10.1109/CISS57580.2022.9971318.
M Rustowicz, R., Cheong, R., Wang, L., Ermon, S., Burke, M., Lobell, D., 2019. Semantic segmentation of crop type in Africa: A novel dataset and analysis of deep learning methods, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. pp. 75–82. http://openaccess.thecvf.com/content_CVPRW_2019/papers/cv4gc/Rustowicz_Semantic_Segmentation_of_Crop_Type_in_Africa_A_Novel_Dataset_CVPRW_2019_paper.pdf
Ma, C., Zhang, H.H., Wang, X., 2014. Machine learning for Big Data analytics in plants. Trends Plant Sci 19, 798–808. https://doi.org/10.1016/j.tplants.2014.08.004.
Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., Johnson, B.A., 2019. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS journal of photogrammetry and remote sensing 152, 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015.
Ma, X., Huang, Z., Zhu, S., Fang, W., Wu, Y., 2022. Rice Planting Area Identification Based on Multi-Temporal Sentinel-1 SAR Images and an Attention U-Net Model. Remote Sens 14. https://doi.org/10.3390/rs14184573.
Magalhães, I.A.L., de Carvalho Júnior, O.A., de Carvalho, O.L.F., de Albuquerque, A.O., Hermuche, P.M., Merino, É.R., Gomes, R.A.T., Guimarães, R.F., 2022. Comparing Machine and Deep Learning Methods for the Phenology-Based Classification of Land Cover Types in the Amazon Biome Using Sentinel-1 Time Series. Remote Sens 14, 4858. https://doi.org/10.3390/rs14194858.
Mandal, D., Bhattacharya, A., Rao, Y.S., 2021. Radar remote sensing for crop biophysical parameter estimation. Springer. http://dx.doi.org/10.1007/978-981-16-4424-5
Mandal, D., Kumar, V., Ratha, D., Dey, S., Bhattacharya, A., Lopez-Sanchez, J.M., McNairn, H., Rao, Y.S., 2020.
Dual polarimetric radar vegetation index for crop growth monitoring using sentinel-1 SAR data. Remote Sens Environ 247, 111954. https://doi.org/10.1016/j.rse.2020.111954.
Martinez, J.A.C., La Rosa, L.E.C., Feitosa, R.Q., Sanches, I.D., Happ, P.N., 2021. Fully convolutional recurrent networks for multidate crop recognition from multitemporal image sequences. ISPRS Journal of Photogrammetry and Remote Sensing 171, 188–201. http://dx.doi.org/10.1016/j.isprsjprs.2020.11.007.
Mas, J.F., Flores, J.J., 2008. The application of artificial neural networks to the analysis of remotely sensed data. Int J Remote Sens 29, 617–663. https://doi.org/10.1080/01431160701352154.
Mascolo, L., Lopez-Sanchez, J.M., Vicente-Guijalba, F., Mazzarella, G., Nunziata, F., Migliaccio, M., 2015. Retrieval of phenological stages of onion fields during the first year of growth by means of C-band polarimetric SAR measurements. Int J Remote Sens 36, 3077–3096. https://doi.org/10.1080/01431161.2015.1055608.
McDonald, A.J., Bennett, J.C., Cookmartin, G., Crossley, S., Morrison, K., Quegan, S., 2000. The effect of leaf geometry on the microwave backscatter from leaves. Int J Remote Sens 21, 395–400. https://doi.org/10.1080/014311600210911.
McNairn, H., Kross, A., Lapen, D., Caves, R., Shang, J., 2014. Early season monitoring of corn and soybeans with TerraSAR-X and RADARSAT-2. International Journal of Applied Earth Observation and Geoinformation 28, 252–259. http://dx.doi.org/10.1016/j.jag.2013.12.015.
McNairn, H., Shang, J., Jiao, X., Champagne, C., 2009. The contribution of ALOS PALSAR multipolarization and polarimetric data to crop classification. IEEE Transactions on Geoscience and Remote Sensing 47, 3981–3992. http://dx.doi.org/10.1109/TGRS.2009.2026052.
Mei, X., Nie, W., Liu, J., Huang, K., 2018. PolSAR image crop classification based on deep residual learning network, in: 2018 7th International Conference on Agro-Geoinformatics. IEEE, pp. 1–6.
http://dx.doi.org/10.1109/Agro-Geoinformatics.2018.8476061.
Mercier, A., Betbeder, J., Rapinel, S., Jegou, N., Baudry, J., Hubert-Moy, L., 2020. Evaluation of Sentinel-1 and-2 time series for estimating LAI and biomass of wheat and rapeseed crop types. J Appl Remote Sens 14, 24512. http://dx.doi.org/10.1117/1.JRS.14.024512.
Meraoumia, I., Dalsasso, E., Denis, L., Abergel, R., & Tupin, F. (2023). Multitemporal speckle reduction with self-supervised deep neural networks. IEEE Transactions on Geoscience and Remote Sensing, 61, 1–14. https://doi.org/10.1109/TGRS.2023.3237466.
Mestre-Quereda, A., Lopez-Sanchez, J.M., Vicente-Guijalba, F., Jacob, A.W., Engdahl, M.E., 2020. Time-series of Sentinel-1 interferometric coherence and backscatter for crop-type mapping. IEEE J Sel Top Appl Earth Obs Remote Sens 13, 4070–4084. http://dx.doi.org/10.1109/JSTARS.2020.3008096.
Metternicht, G.I., Zinck, J.A., 2003. Remote sensing of soil salinity: potentials and constraints. Remote Sens Environ 85, 1–20. https://doi.org/10.1016/S0034-4257(02)00188-8.
Metzler, C.A., Ikoma, H., Peng, Y., Wetzstein, G., 2020. Deep optics for single-shot high-dynamic-range imaging, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 1375–1385. https://doi.org/10.48550/arXiv.1908.00620.
Mikołajczyk, A., Grochowski, M., 2019. Style transfer-based image synthesis as an efficient regularization technique in deep learning, in: 2019 24th International Conference on Methods and Models in Automation and Robotics (MMAR). IEEE, pp. 42–47. https://doi.org/10.48550/arXiv.1905.10974.
Minh, D.H.T., Ienco, D., Gaetano, R., Lalande, N., Ndikumana, E., Osman, F., Maurel, P., 2018. Deep recurrent neural networks for winter vegetation quality mapping via multitemporal SAR Sentinel-1. IEEE Geoscience and Remote Sensing Letters 15, 464–468. https://doi.org/10.48550/arXiv.1708.03694.
Mirzaei, A., Bagheri, H., Khosravi, I., 2023.
Enhancing Crop Classification Accuracy through Synthetic SAR-Optical Data Generation Using Deep Learning. ISPRS Int J Geoinf 12, 450. https://doi.org/10.3390/ijgi12110450.
Mohan, E., Rajesh, A., Sunitha, G., Konduru, R.M., Avanija, J., Ganesh Babu, L., 2021. A deep neural network learning‐based speckle noise removal technique for enhancing the quality of synthetic‐aperture radar images. Concurr Comput 33, e6239. http://dx.doi.org/10.1002/cpe.6239.
Mullissa, A.G., Persello, C., Tolpekin, V., 2018. Fully convolutional networks for multi-temporal SAR image classification. In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium (pp. 6635-6638). IEEE. http://dx.doi.org/10.1109/IGARSS.2018.8518780
Muruganantham, P., Wibowo, S., Grandhi, S., Samrat, N.H., Islam, N., 2022. A systematic literature review on crop yield prediction with deep learning and remote sensing. Remote Sens 14, 1990. https://doi.org/10.3390/rs14091990.
Najem, S., Baghdadi, N., Bazzi, H., Lalande, N., Bouchet, L., 2023. Detection and Mapping of Cover Crops using Sentinel-1 SAR Remote Sensing data. IEEE J Sel Top Appl Earth Obs Remote Sens http://dx.doi.org/10.1109/JSTARS.2023.3337989.
Nakalembe, C.L., O.H., D.N., & K.B., 2021. 2019 Mali Crop Type Training Data for Machine Learning. Radiant MLHub. https://doi.org/10.34911/rdnt.tgz68o.
NASA., 2008. Soil Moisture Active Passive Validation Experiment 2008 (SMAPVEX08) In Situ Vegetation Data. Version 1. https://nsidc.org/data/sv08v/versions/1. [Dataset].
NASA., 2016a. Soil Moisture Active Passive Validation Experiment 2016 (SMAPVEX16) Iowa PALS Brightness Temperature and Soil Moisture Data. Version 1. https://nsidc.org/data/sv08v/versions/1. [Dataset].
NASA., 2016b. Soil Moisture Active Passive Validation Experiment 2016 (SMAPVEX16) Manitoba In Situ Vegetation Data. Version 1. https://nsidc.org/data/sv16m_v/versions/1. [Dataset].
Nasirzadehdizaji, R., Balik Sanli, F., Abdikan, S., Cakir, Z., Sekertekin, A., Ustuner, M., 2019.
Sensitivity analysis of multi-temporal Sentinel-1 SAR parameters to crop height and canopy coverage. Applied Sciences 9, 655. https://doi.org/10.3390/app9040655.
Navacchi, C., Cao, S., Bauer-Marschallinger, B., Snoeij, P., Small, D., Wagner, W., 2022. Utilising Sentinel-1’s orbital stability for efficient pre-processing of sigma nought backscatter. ISPRS Journal of Photogrammetry and Remote Sensing 192, 130–141. https://doi.org/10.1016/j.isprsjprs.2022.07.023.
Navarro, A., Rolim, J., Miguel, I., Catalão, J., Silva, J., Painho, M., & Vekerdy, Z. (2016). Crop monitoring based on SPOT-5 Take-5 and sentinel-1A data for the estimation of crop water requirements. Remote Sensing, 8(6), 525. https://doi.org/10.3390/rs8060525.
Ndikumana, E., Ho Tong Minh, D., Baghdadi, N., Courault, D., Hossard, L., 2018. Deep recurrent neural network for agricultural classification using multitemporal SAR Sentinel-1 for Camargue, France. Remote Sens 10, 1217. https://doi.org/10.3390/rs10081217.
Ngo, T.X., Bui, N.B., Phan, H.D.T., Ha, H.M., Nguyen, T.T.N., 2023. Paddy rice mapping in Red River Delta, Vietnam, using Sentinel 1/2 data and machine learning algorithms. J Spat Sci 1–17. http://dx.doi.org/10.1080/14498596.2023.2174196.
Nguyen, D.B., Gruber, A., Wagner, W., 2016. Mapping rice extent and cropping scheme in the Mekong Delta using Sentinel-1A data. Remote Sensing Letters 7, 1209–1218. http://dx.doi.org/10.1080/2150704X.2016.1225172.
Ni, J., López-Martínez, C., Hu, Z., Zhang, F., 2022. Multitemporal SAR and Polarimetric SAR Optimization and Classification: Reinterpreting Temporal Coherence. IEEE Transactions on Geoscience and Remote Sensing 60, pp. 1–17. https://doi.org/10.1109/tgrs.2022.3214097
Nurmemet, I., Sagan, V., Ding, J.-L., Halik, Ü., Abliz, A., Yakup, Z., 2018. A WFS-SVM model for soil salinity mapping in keriya oasis, northwestern china using polarimetric decomposition and fully PolSAR data. Remote Sens 10, 598. http://dx.doi.org/10.3390/rs10040598.
Ofori-Ampofo, S., Pelletier, C., Lang, S., 2021. Crop type mapping from optical and radar time series using attention-based deep learning. Remote Sens 13, 4668. https://doi.org/10.3390/rs13224668.
Oktay, O., Schlemper, J., Folgoc, L. Le, Lee, M., Heinrich, M., Misawa, K., Mori, K., McDonagh, S., Hammerla, N.Y., Kainz, B., 2018. Attention u-net: Learning where to look for the pancreas. https://doi.org/10.48550/arXiv.1804.03999.
Olimov, B., Subramanian, B., Ugli, R.A.A., Kim, J.-S., Kim, J., 2023. Consecutive multiscale feature learning-based image classification model. Sci Rep 13, 3595. https://doi.org/10.1038/s41598-023-30480-8.
Oquab, M., Bottou, L., Laptev, I., Sivic, J., 2014. Learning and transferring mid-level image representations using convolutional neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1717–1724. https://doi.org/10.1109/CVPR.2014.222.
Oveis, A.H., Giusti, E., Ghio, S., Martorella, M., 2022. A Survey on the Applications of Convolutional Neural Networks for Synthetic Aperture Radar: Recent Advances. IEEE Aerospace and Electronic Systems Magazine 37, 18–42. https://doi.org/10.1109/MAES.2021.3117369.
Pacheco, A.M., McNairn, H., Merzouki, A., 2010. Evaluating TerraSAR-X for the identification of tillage occurrence over an agricultural area in Canada, in: Remote Sensing for Agriculture, Ecosystems, and Hydrology XII. SPIE, pp. 156–162. http://dx.doi.org/10.1117/12.868218.
Pandžić, M., Pavlović, D., Matavulj, P., Brdar, S., Marko, O., Crnojević, V., Kilibarda, M., 2024. Interseasonal transfer learning for crop mapping using Sentinel-1 data. International Journal of Applied Earth Observation and Geoinformation 128, 103718. https://doi.org/10.1016/j.jag.2024.103718.
Park, J., Johnson, J.T., Majurec, N., Niamsuwan, N., Piepmeier, J.R., Mohammed, P.N., Ruf, C.S., Misra, S., Yueh, S.H., Dinardo, S.J., 2011.
Airborne L-band radio frequency interference observations from the SMAPVEX08 campaign and associated flights. IEEE Transactions on Geoscience and Remote Sensing 49, 3359–3370. http://dx.doi.org/10.1109/TGRS.2011.2107560.
Paul, S., Kumari, M., Murthy, C.S., Nagesh Kumar, D., 2022. Generating pre-harvest crop maps by applying convolutional neural network on multi-temporal Sentinel-1 data. Int J Remote Sens 43, 6078–6101. http://dx.doi.org/10.1080/01431161.2022.2030072.
Periasamy, S., 2018. Significance of dual polarimetric synthetic aperture radar in biomass retrieval: An attempt on Sentinel-1. Remote Sens Environ 217, 537–549. https://doi.org/10.1016/j.rse.2018.09.003.
Phan, H., Le Toan, T., Bouvet, A., Nguyen, L.D., Pham Duy, T., Zribi, M., 2018. Mapping of rice varieties and sowing date using X-band SAR data. Sensors 18, 316. https://doi.org/10.3390/s18010316.
Pires de Lima, R., Marfurt, K., 2019. Convolutional neural network for remote-sensing scene classification: Transfer learning analysis. Remote Sens 12, 86. https://doi.org/10.3390/rs12010086.
PlantVillage, 2019. PlantVillage Kenya Ground Reference Crop Type Dataset. Version 1.0. Radiant MLHub. https://doi.org/10.34911/RDNT.U41J87. [Dataset].
Prexl, J., Schmitt, M., 2023. Multi-Modal Multi-Objective Contrastive Learning for Sentinel-1/2 Imagery, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 2135–2143. https://doi.org/10.3390/rs15164102.
Qiao, M., He, X., Cheng, X., Li, P., Luo, H., Tian, Z., Guo, H., 2021. Exploiting hierarchical features for crop yield prediction based on 3-D convolutional neural networks and multikernel Gaussian process. IEEE J Sel Top Appl Earth Obs Remote Sens 14, 4476–4489. http://dx.doi.org/10.1109/JSTARS.2021.3073149.
Qu, Y., Zhao, W., Yuan, Z., Chen, J., 2020. Crop mapping from sentinel-1 polarimetric time-series with a deep neural network. Remote Sens 12, 2493. https://doi.org/10.3390/rs12152493.
Quast, R., Wagner, W., Bauer-Marschallinger, B., Vreugdenhil, M., 2023. Soil moisture retrieval from Sentinel-1 using a first-order radiative transfer model—A case-study over the Po-Valley. Remote Sens Environ 295, 113651. https://doi.org/10.1016/j.rse.2023.113651.
Radiant Earth Foundation & IDinsight, 2022. AgriFieldNet Competition Dataset, Version 1.0. Radiant MLHub. https://beta.source.coop/radiantearth/agrifieldnet-competition/. [Dataset].
Raney, R.K., Cahill, J.T.S., Patterson, G.W., Bussey, D.B.J., 2012. The m-chi decomposition of hybrid dual-polarimetric radar data with application to lunar craters. J Geophys Res Planets 117. http://dx.doi.org/10.1029/2011JE003986.
Reicosky, D.C., Forcella, F., 1998. Cover crop and soil quality interactions in agroecosystems. J Soil Water Conserv 53(3), pp. 224–229. https://link.gale.com/apps/doc/A21170093/AONE?u=anon~bc52b7&sid=googleScholar&xid=f194559d
Reinermann, S., Gessner, U., Asam, S., Ullmann, T., Schucknecht, A., Kuenzer, C., 2022. Detection of grassland mowing events for Germany by combining Sentinel-1 and Sentinel-2 time series. Remote Sens 14, 1647. https://doi.org/10.3390/rs14071647.
Remelgado, R., Zaitov, S., Kenjabaev, S., Stulina, G., Sultanov, M., Ibrakhimov, M., Akhmedov, M., Dukhovny, V., Conrad, C., 2020. A crop type dataset for consistent land cover classification in Central Asia. Sci Data 7, 250. https://doi.org/10.1038/s41597-020-00591-2.
Rezaee, M., Mahdianpari, M., Zhang, Y., Salehi, B., 2018. Deep convolutional neural network for complex wetland classification using optical remote sensing imagery. IEEE J Sel Top Appl Earth Obs Remote Sens 11, 3030–3039. http://dx.doi.org/10.1109/JSTARS.2018.2846178.
Richardson, A.D., Keenan, T.F., Migliavacca, M., Ryu, Y., Sonnentag, O., Toomey, M., 2013. Climate change, phenology, and phenological control of vegetation feedbacks to the climate system. Agric For Meteorol 169, 156–173. https://doi.org/10.1016/j.agrformet.2012.09.012.
Rineer, J., Beach, R., Lapidus, D., O’Neil, M., Temple, D., Ujeneza, N., Cajka, J., & Chew, R. (2021). Drone imagery classification training dataset for crop types in Rwanda. Version 1. https://radiantearth.blob.core.windows.net/mlhub/rti-rwanda-crop-type/documentation.pdf. [Dataset].
Rogozinski, M., Martinez, J.A.C., Feitosa, R.Q., 2022. 3D convolution for multidate crop recognition from multitemporal image sequences. Int J Remote Sens 43, 6056–6077. http://dx.doi.org/10.1080/01431161.2021.1976876.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer, pp. 234–241. https://doi.org/10.48550/arXiv.1505.04597.
Rosen, P.A., Hensley, S., Joughin, I.R., Li, F.K., Madsen, S.N., Rodríguez, E., Goldstein, R., 2000. Synthetic aperture radar interferometry. Proceedings of the IEEE 88, 333–382. https://doi.org/10.1109/5.838084.
Roy, D.P., Wulder, M.A., Loveland, T.R., Woodcock, C.E., Allen, R.G., Anderson, M.C., Helder, D., Irons, J.R., Johnson, D.M., Kennedy, R., 2014. Landsat-8: Science and product vision for terrestrial global change research. Remote Sens Environ 145, 154–172. https://doi.org/10.1016/j.rse.2014.02.001.
Russello, H., 2018. Convolutional neural networks for crop yield prediction using satellite images. IBM Center for Advanced Studies. https://api.semanticscholar.org/CorpusID:51786849.
Rußwurm, M., Courty, N., Emonet, R., Lefèvre, S., Tuia, D., Tavenard, R., 2023. End-to-end learned early classification of time series for in-season crop type mapping. ISPRS Journal of Photogrammetry and Remote Sensing 196, 445–456. https://doi.org/10.1016/j.isprsjprs.2022.12.016.
Rußwurm, M., Körner, M., 2018.
Multi-temporal land cover classification with sequential recurrent encoders. ISPRS Int J Geoinf 7, 129. https://doi.org/10.3390/ijgi7040129.
Rußwurm, M., Lefèvre, S., Körner, M., 2019. Breizhcrops: A satellite time series dataset for crop type identification, in: Proceedings of the International Conference on Machine Learning Time Series Workshop. https://doi.org/10.48550/arXiv.1905.11893.
Ryu, D., Lee, S.-G., 2023. Mapping Vegetation Water Content over Agricultural Landscapes Using Satellite C-and X-Band Synthetic Aperture Radar, in: IGARSS 2023-2023 IEEE International Geoscience and Remote Sensing Symposium. IEEE, pp. 399–402. http://dx.doi.org/10.1109/IGARSS52108.2023.10282451.
Saadat, M., Seydi, S.T., Hasanlou, M., Homayouni, S., 2022. A Convolutional Neural Network Method for Rice Mapping Using Time-Series of Sentinel-1 and Sentinel-2 Imagery. Agriculture 12, 2083. https://doi.org/10.3390/agriculture12122083.
Sanches, I.D., Feitosa, R.Q., Diaz, P.M.A., Soares, M.D., Luiz, A.J.B., Schultz, B., Maurano, L.E.P., 2018. Campo verde database: Seeking to improve agricultural remote sensing of tropical areas. IEEE Geoscience and Remote Sensing Letters 15, 369–373. http://dx.doi.org/10.1109/LGRS.2017.2789120.
Schneider, M., Broszeit, A., Körner, M., 2021. Eurocrops: A pan-european dataset for time series crop type classification. https://doi.org/10.48550/arXiv.2106.08151.
Schneider, M., Schelte, T., Schmitz, F., Körner, M., 2023. EuroCrops: All you need to know about the Largest Harmonised Open Crop Dataset Across the European Union. https://doi.org/10.1038/s41597-023-02517-0.
Schuster, C., Ali, I., Lohmann, P., Frick, A., Förster, M., Kleinschmit, B., 2011. Towards detecting swath events in TerraSAR-X time series to establish NATURA 2000 grassland habitat swath management as monitoring parameter. Remote Sens 3, 1308–1322. https://doi.org/10.3390/rs3071308.
Shang, J., Liu, J., Poncos, V., Geng, X., Qian, B., Chen, Q., Dong, T., Macdonald, D., Martin, T., Kovacs, J., 2020. Detection of crop seeding and harvest through analysis of time-series Sentinel-1 interferometric SAR data. Remote Sens 12, 1551. https://doi.org/10.3390/rs12101551.
Shang, R., Wang, J., Jiao, L., Yang, X., Li, Y., 2022. Spatial feature-based convolutional neural network for PolSAR image classification. Appl Soft Comput 123, 108922. http://dx.doi.org/10.1016/j.asoc.2022.108922.
Shi, H., Sheng, Q., Wang, Y., Yue, B., Chen, L., 2022. Dynamic range compression self-adaption method for SAR image based on deep learning. Remote Sens 14, 2338. https://doi.org/10.3390/rs14102338.
Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K., Woo, W., 2015. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv Neural Inf Process Syst 28. https://doi.org/10.48550/arXiv.1506.04214.
Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. https://doi.org/10.48550/arXiv.1409.1556.
Skakun, S., Kussul, N., Shelestov, A.Y., Lavreniuk, M., Kussul, O., 2015. Efficiency assessment of multitemporal C-band Radarsat-2 intensity and Landsat-8 surface reflectance satellite imagery for crop classification in Ukraine. IEEE J Sel Top Appl Earth Obs Remote Sens 9, 3712–3719. https://doi.org/10.1109/JSTARS.2015.2454297.
Skriver, H., 2011. Crop classification by multitemporal C-and L-band single-and dual-polarization and fully polarimetric SAR. IEEE Transactions on Geoscience and Remote Sensing 50, 2138–2149. http://dx.doi.org/10.1109/TGRS.2011.2172994.
Skriver, H., Mattia, F., Satalino, G., Balenzano, A., Pauwels, V.R.N., Verhoest, N.E.C., Davidson, M., 2011. Crop classification using short-revisit multitemporal SAR data. IEEE J Sel Top Appl Earth Obs Remote Sens 4, 423–431. https://doi.org/10.1109/JSTARS.2011.2106198.
Small, D., 2011.
Flattening gamma: Radiometric terrain correction for SAR imagery. IEEE Transactions on Geoscience and Remote Sensing 49, 3081–3093. https://doi.org/10.1109/TGRS.2011.2120616.
Sonobe, R., Yamaya, Y., Tani, H., Wang, X., Kobayashi, N., Mochizuki, K., 2017. Assessing the suitability of data from Sentinel-1A and 2A for crop classification. GIsci Remote Sens 54, 918–938. https://doi.org/10.1080/15481603.2017.1351149.
Soudani, K., le Maire, G., Dufrêne, E., François, C., Delpierre, N., Ulrich, E., Cecchini, S., 2008. Evaluation of the onset of green-up in temperate deciduous broadleaf forests derived from Moderate Resolution Imaging Spectroradiometer (MODIS) data. Remote Sens Environ 112, 2643–2655. https://doi.org/10.1016/j.rse.2007.12.004.
Soussana, J., Loiseau, P., Vuichard, N., Ceschia, E., Balesdent, J., Chevallier, T., Arrouays, D., 2004. Carbon cycling and sequestration opportunities in temperate grasslands. Soil Use Manag 20, 219–230. https://doi.org/10.1111/j.1475-2743.2004.tb00362.x.
Steele-Dunne, S.C., McNairn, H., Monsivais-Huertero, A., Judge, J., Liu, P.-W., Papathanassiou, K., 2017. Radar remote sensing of agricultural canopies: A review. IEEE J Sel Top Appl Earth Obs Remote Sens 10, 2249–2273. http://dx.doi.org/10.1109/JSTARS.2016.2639043.
Su, Z., Timmermans, W.J., Van Der Tol, C., Dost, R., Bianchi, R., Gómez, J.A., House, A., Hajnsek, I., Menenti, M., Magliulo, V., 2009. EAGLE 2006–Multi-purpose, multi-angle and multi-sensor in-situ and airborne campaigns over grassland and forest. Hydrol Earth Syst Sci 13, 833–845. https://doi.org/10.5194/hess-13-833-2009.
Sun, C., Bian, Y., Zhou, T., Pan, J., 2019. Using of multi-source and multi-temporal remote sensing data improves crop-type mapping in the subtropical agriculture region. Sensors 19, 2401. https://doi.org/10.3390/s19102401.
Sun, C., Zhang, H., Ge, J., Wang, C., Li, L., Xu, L., 2022.
Rice Mapping in a Subtropical Hilly Region Based on Sentinel-1 Time Series Feature Analysis and the Dual Branch BiLSTM Model. Remote Sens 14, 3213. https://doi.org/10.3390/rs14133213.
Sun, J., Di, L., Sun, Z., Shen, Y., Lai, Z., 2019. County-level soybean yield prediction using deep CNN-LSTM model. Sensors 19, 4363. https://doi.org/10.3390/s19204363.
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A., 2017. Inception-v4, inception-resnet and the impact of residual connections on learning, in: Proceedings of the AAAI Conference on Artificial Intelligence. http://dx.doi.org/10.1609/aaai.v31i1.11231.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., 2015. Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1–9. https://doi.org/10.1109/CVPR.2015.7298594.
Tamm, T., Zalite, K., Voormansik, K., Talgre, L., 2016. Relating Sentinel-1 interferometric coherence to mowing events on grasslands. Remote Sens 8, 802. https://doi.org/10.3390/rs8100802.
Team, P.F., 2022. Planet Fusion Monitoring Technical Specification, Version 1.0.0. San Francisco, CA. https://support.planet.com/hc/en-us/articles/4406292582673-Planet-Fusion-Monitoring-Technical-Specification.html. [Dataset].
Teimouri, M., Mokhtarzade, M., Baghdadi, N., Heipke, C., 2022. Fusion of time-series optical and SAR images using 3D convolutional neural networks for crop classification. Geocarto Int, 37(27), pp. 15143-15160. https://doi.org/10.1080/10106049.2022.2095446.
Teimouri, N., Dyrmann, M., Jørgensen, R.N., 2019. A novel spatio-temporal FCN-LSTM network for recognizing various crop types using multi-temporal radar images. Remote Sens 11, 990. https://doi.org/10.3390/rs11080990.
Terliksiz, A.S., Altılar, D.T., 2019.
Use of deep neural networks for crop yield prediction: A case study of soybean yield in lauderdale county, alabama, usa, in: 2019 8th International Conference on Agro-Geoinformatics. IEEE, pp. 1–4. http://dx.doi.org/10.1109/Agro-Geoinformatics.2019.8820257.
Tesfaye, A.A., Awoke, B.G., Sida, T.S., Osgood, D.E., 2022. Enhancing smallholder wheat yield prediction through sensor fusion and phenology with machine learning and deep learning methods. Agriculture 12, 1352. https://doi.org/10.3390/agriculture12091352.
Thorp, K.R., Drajat, D., 2021. Deep machine learning with Sentinel satellite data to map paddy rice production stages across West Java, Indonesia. Remote Sens Environ 265, 112679. https://doi.org/10.1016/j.rse.2021.112679.
Togliatti, K., Lewis-Beck, C., Walker, V.A., Hartman, T., VanLoocke, A., Cosh, M.H., Hornbuckle, B.K., 2022. Quantitative Assessment of Satellite L-Band Vegetation Optical Depth in the U.S. Corn Belt. IEEE Geoscience and Remote Sensing Letters 19, 1–5. https://doi.org/10.1109/LGRS.2020.3034174.
Torres, R., Snoeij, P., Geudtner, D., Bibby, D., Davidson, M., Attema, E., Potin, P., Rommen, B., Floury, N., Brown, M. and Traver, I.N., 2012. GMES Sentinel-1 mission. Remote sensing of environment, 120, pp. 9-24. http://dx.doi.org/10.1016/j.rse.2011.05.028.
Tripathi, A., Tiwari, R.K., Tiwari, S.P., 2022. A deep learning multi-layer perceptron and remote sensing approach for soil health based crop yield estimation. International Journal of Applied Earth Observation and Geoinformation 113, 102959. https://doi.org/10.1016/j.jag.2022.102959.
Trudel, M., Charbonneau, F., Leconte, R., 2012. Using RADARSAT-2 polarimetric and ENVISAT-ASAR dual-polarization data for estimating soil moisture over agricultural fields. Canadian Journal of Remote Sensing 38, 514–527. http://dx.doi.org/10.5589/m12-043.
Tuia, D., Persello, C., Bruzzone, L., 2016. Domain adaptation for the classification of remote sensing data: An overview of recent advances.
IEEE Geosci Remote Sens Mag 4, 41–57. http://dx.doi.org/10.1109/MGRS.2016.2548504.
Turkoglu, M.O., D’Aronco, S., Perich, G., Liebisch, F., Streit, C., Schindler, K., Wegner, J.D., 2021. Crop mapping from image time series: Deep learning with multi-scale label hierarchies. Remote Sens Environ 264, 112603. https://doi.org/10.48550/arXiv.2102.08820.
Ulaby, F.T., Dubois, P.C. and Van Zyl, J., 1996. Radar mapping of surface soil moisture. Journal of hydrology, 184(1-2), pp. 57-84. https://doi.org/10.1016/0022-1694(95)02968-0.
Van Tricht, K., Degerickx, J., Gilliams, S., Zanaga, D., Battude, M., Grosu, A., Brombacher, J., Lesiv, M., Bayas, J. C. L., & Karanam, S. (2023). WorldCereal: a dynamic open-source system for global-scale, seasonal, and reproducible crop and irrigation mapping. Earth System Science Data Discussions, 2023, 1–36. https://doi.org/10.5194/essd-15-5491-2023.
Vanschoren, J., 2018. Meta-learning: A survey. https://doi.org/10.48550/arXiv.1810.03548.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I., 2017. Attention is all you need. Adv Neural Inf Process Syst 30. https://doi.org/10.48550/arXiv.1706.03762.
Veloso, A., Mermoz, S., Bouvet, A., Le Toan, T., Planells, M., Dejoux, J.-F., Ceschia, E., 2017. Understanding the temporal behavior of crops using Sentinel-1 and Sentinel-2-like data for agricultural applications. Remote Sens Environ 199, 415–426. https://doi.org/10.1016/j.rse.2017.07.015.
Vicente-Guijalba, F., Martinez-Marin, T., Lopez-Sanchez, J.M., 2014. Dynamical approach for real-time monitoring of agricultural crops. IEEE Transactions on Geoscience and Remote Sensing 53, 3278–3293. http://dx.doi.org/10.1109/TGRS.2014.2372897.
Villarroya-Carpio, A., Lopez-Sanchez, J.M., Engdahl, M.E., 2022. Sentinel-1 interferometric coherence as a vegetation index for agriculture. Remote Sens Environ 280, 113208. http://dx.doi.org/10.1016/j.rse.2022.113208.
Voormansik, K., Jagdhuber, T., Zalite, K., Noorma, M., Hajnsek, I., 2015. Observations of cutting practices in agricultural grasslands using polarimetric SAR. IEEE J Sel Top Appl Earth Obs Remote Sens 9, 1382–1396. http://dx.doi.org/10.1109/JSTARS.2015.2503773.
Waithaka, L., Kramer, B., Hufkens, K., Kivuva, B., & Mansabdar, S. (2022). Eyes on the Ground Image Data, Version 1.0 (Version 1.0) [Dataset]. Radiant MLHub. https://doi.org/10.34911/rdnt.1bs2jw.
Wang, D., Cao, W., Zhang, F., Li, Z., Xu, S., Wu, X., 2022. A review of deep learning in multiscale agricultural sensing. Remote Sens 14, 559. https://doi.org/10.3390/rs14030559.
Wang, D., Zhang, Q., Xu, Y., Zhang, J., Du, B., Tao, D., Zhang, L., 2022. Advancing plain vision transformer toward remote sensing foundation model. IEEE Transactions on Geoscience and Remote Sensing 61, 1–15. https://doi.org/10.48550/arXiv.2208.03987.
Wang, J., Wang, P., Tian, H., Tansey, K., Liu, J., Quan, W., 2023. A deep learning framework combining CNN and GRU for improving wheat yield estimates using time series remotely sensed multi-variables. Comput Electron Agric 206, 107705. https://doi.org/10.1016/j.compag.2023.107705.
Wang, M., Wang, J., Chen, L., 2020. Mapping paddy rice using weakly supervised long short-term memory network with time series Sentinel optical and SAR images. Agriculture 10, 483. https://doi.org/10.3390/agriculture10100483.
Wang, M., Wang, J., Cui, Y., Liu, J., Chen, L., 2022. Agricultural Field Boundary Delineation with Satellite Image Segmentation for High-Resolution Crop Mapping: A Case Study of Rice Paddy. Agronomy 12. https://doi.org/10.3390/agronomy12102342.
Wang, Y., Zhang, Z., Feng, L., Ma, Y., Du, Q., 2021. A new attention-based CNN approach for crop mapping using time series Sentinel-2 images. Comput Electron Agric 184, 106090. http://dx.doi.org/10.1016/j.compag.2021.106090.
Wei, P., Chai, D., Lin, T., Tang, C., Du, M., Huang, J., 2021.
Large-scale rice mapping under different years based on time-series Sentinel-1 images using deep semantic segmentation model. ISPRS Journal of Photogrammetry and Remote Sensing 174, 198–214. https://doi.org/10.1016/j.isprsjprs.2021.02.011.
Wei, S., Zhang, H., Wang, C., Wang, Y., Xu, L., 2019. Multi-temporal SAR data large-scale crop mapping based on U-Net model. Remote Sens 11, 68. https://doi.org/10.3390/rs11010068.
Weikmann, G., Paris, C., & Bruzzone, L. (2021). Timesen2crop: A million labeled samples dataset of sentinel 2 image time series for crop-type classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14, 4699–4708. http://dx.doi.org/10.1109/JSTARS.2021.3073965.
Weilandt, F., Behling, R., Goncalves, R., Madadi, A., Richter, L., Sanona, T., Spengler, D., Welsch, J., 2023. Early Crop Classification via Multi-Modal Satellite Data Fusion and Temporal Attention. Remote Sens 15, 799. https://doi.org/10.3390/rs15030799.
Weiss, M., Jacob, F., Duveiller, G., 2020. Remote sensing for agricultural applications: A meta-review. Remote Sens Environ 236, 111402. https://doi.org/10.1016/j.rse.2019.111402.
Wenger, R., Puissant, A., Weber, J., Idoumghar, L., Forestier, G., 2022. Multimodal and Multitemporal Land Use/Land Cover Semantic Segmentation on Sentinel-1 and Sentinel-2 Imagery: An Application on a MultiSenGE Dataset. Remote Sens 15, 151. https://doi.org/10.3390/rs15010151.
Whelen, T., Siqueira, P., 2017. Use of time-series L-band UAVSAR data for the classification of agricultural fields in the San Joaquin Valley. Remote Sens Environ 193, 216–224. https://doi.org/10.1016/j.rse.2017.03.014.
Xu, F., Jin, Y.-Q., 2005. Deorientation theory of polarimetric scattering targets and application to terrain surface classification. IEEE Transactions on Geoscience and Remote Sensing 43, 2351–2364. http://dx.doi.org/10.1109/TGRS.2005.855064.
Xu, L., Zhang, H., Wang, C., Wei, S., Zhang, B., Wu, F., Tang, Y., 2021. Paddy rice mapping in thailand using time-series sentinel-1 data and deep learning model. Remote Sens 13, 3994. https://doi.org/10.3390/rs13193994.
Xu, Y., Ma, Y., Zhang, Z., 2024. Self-supervised pre-training for large-scale crop mapping using Sentinel-2 time series. ISPRS Journal of Photogrammetry and Remote Sensing 207, 312–325. http://dx.doi.org/10.1016/j.isprsjprs.2023.12.005.
Yahia, M., Ali, T., Mortula, M.M., 2020. Span statistics and their impacts on PolSAR applications. IEEE Geoscience and Remote Sensing Letters 19, 1–5. http://dx.doi.org/10.1109/LGRS.2020.3039109.
Yahya, A.A., Liu, K., Hawbani, A., Wang, Y., Hadi, A.N., 2023. A Novel Image Classification Method Based on Residual Network, Inception, and Proposed Activation Function. Sensors 23, 2976. https://doi.org/10.3390/s23062976.
Yamaguchi, Y., Moriyama, T., Ishido, M., Yamada, H., 2005. Four-component scattering model for polarimetric SAR image decomposition. IEEE Transactions on geoscience and remote sensing 43, 1699–1706. https://doi.org/10.1109/TGRS.2005.852084.
Yang, H., Pan, B., Li, N., Wang, W., Zhang, J., Zhang, X., 2021. A systematic method for spatio-temporal phenology estimation of paddy rice using time series Sentinel-1 images. Remote Sens Environ 259, 112394. https://doi.org/10.1016/j.rse.2021.112394.
Yang, J., Yamaguchi, Y., Yamada, H., Sengoku, M., Lin, S., 1998. Stable decomposition of Mueller matrix. IEICE transactions on communications 81, 1261–1268. https://search.ieice.org/bin/summary.php?id=e81-b_6_1261
Yao, J., Wu, J., Xiao, C., Zhang, Z., Li, J., 2022. The Classification Method Study of Crops Remote Sensing with Deep Learning, Machine Learning, and Google Earth Engine. Remote Sens 14, 2758. https://doi.org/10.3390/rs14122758.
Yin, Q., Lin, Z., Hu, W., López-Martínez, C., Ni, J., Zhang, F., 2023. Crop classification of multi-temporal PolSAR based on 3D attention module with ViT.
IEEE Geoscience and Remote Sensing Letters. http://dx.doi.org/10.1109/LGRS.2023.3270488.
Yu, W., Yang, G., Li, D., Zheng, H., Yao, X., Zhu, Y., Cao, W., Qiu, L., Cheng, T., 2023. Improved prediction of rice yield at field and county levels by synergistic use of SAR, optical and meteorological data. Agric For Meteorol 342, 109729. https://doi.org/10.1016/j.agrformet.2023.109729.
Yuan, W., Chen, Y., Xia, J., Dong, W., Magliulo, V., Moors, E., Olesen, J.E., Zhang, H., 2016. Estimating crop yield using a satellite-based light use efficiency model. Ecol Indic 60, 702–709. https://doi.org/10.1016/j.ecolind.2015.08.013.
Yuan, Y., Lin, L., Zhou, Z.-G., Jiang, H., Liu, Q., 2023. Bridging optical and SAR satellite image time series via contrastive feature extraction for crop classification. ISPRS Journal of Photogrammetry and Remote Sensing 195, 222–232. https://doi.org/10.1016/j.isprsjprs.2022.11.020.
Zebker, H.A., Villasenor, J., 1992. Decorrelation in interferometric radar echoes. IEEE Transactions on geoscience and remote sensing 30, 950–959. https://doi.org/10.1109/36.175330.
Zeyada, H.H., Ezz, M.M., Nasr, A.H., Shokr, M., Harb, H.M., 2016. Evaluation of the discrimination capability of full polarimetric SAR data for crop classification. Int J Remote Sens 37, 2585–2603. http://dx.doi.org/10.1080/01431161.2016.1182663.
Zhang, Q., Li, L., Sun, R., Zhu, D., Zhang, C., Chen, Q., 2020. Retrieval of the soil salinity from Sentinel-1 Dual-Polarized SAR data based on deep neural network regression. IEEE Geoscience and Remote Sensing Letters 19, 1–5. http://dx.doi.org/10.1109/LGRS.2020.3041059.
Zhang, S., Li, Q., Zhang, X., Wei, K., Chen, L., Liang, W., 2012. Effects of conservation tillage on soil aggregation and aggregate binding agents in black soil of Northeast China. Soil Tillage Res 124, 196–202. http://dx.doi.org/10.1016/j.still.2012.06.007.
Zhang, W., Yu, Q., Tang, H., Liu, J., Wu, W., 2024.
Conservation tillage mapping and monitoring using remote sensing. Comput Electron Agric 218, 108705. https://doi.org/10.1016/j.compag.2024.108705.
Zhang, W.-T., Liu, L., Bai, Y., Li, Y.-B., Guo, J., 2023. Crop classification based on multi-temporal PolSAR images with a single tensor network. Pattern Recognition 143, 109773. http://dx.doi.org/10.1016/j.patcog.2023.109773.
Zhao, H., Chen, Z., Jiang, H., Jing, W., Sun, L., Feng, M., 2019. Evaluation of three deep learning models for early crop classification using sentinel-1A imagery time series—A case study in Zhanjiang, China. Remote Sens 11, 2673. https://doi.org/10.3390/rs11222673.
Zhao, W., Qu, Y., Chen, J., Yuan, Z., 2020. Deeply synergistic optical and SAR time series for crop dynamic monitoring. Remote Sens Environ 247, 111952. http://dx.doi.org/10.1016/j.rse.2020.111952.
Zhao, W., Qu, Y., Zhang, L., Li, K., 2022. Spatial-aware SAR-optical time-series deep integration for crop phenology tracking. Remote Sens Environ 276, 113046. http://dx.doi.org/10.1016/j.rse.2022.113046.
Zheng, A., Casari, A., 2018. Feature Engineering for Machine Learning. O'Reilly Media, Inc., Sebastopol, CA.
Zheng, B., Campbell, J.B., Serbin, G., Galbraith, J.M., 2014. Remote sensing of crop residue and tillage practices: Present capabilities and future prospects. Soil Tillage Res 138, 26–34. https://doi.org/10.1016/j.still.2013.12.009.
Zhong, L., Hu, L., Zhou, H., 2019. Deep learning based multi-temporal crop classification. Remote Sens Environ 221, 430–443. https://doi.org/10.1016/j.rse.2018.11.032.
Zhou, Y., Luo, J., Feng, L., Yang, Y., Chen, Y., Wu, W., 2019. Long-short-term-memory-based crop classification using high-resolution optical images and multi-temporal SAR data. GIsci Remote Sens 56, 1170–1191. https://doi.org/10.1080/15481603.2019.1628412.
Zhu, X.X., Montazeri, S., Ali, M., Hua, Y., Wang, Y., Mou, L., Shi, Y., Xu, F., Bamler, R., 2021. Deep learning meets SAR: Concepts, models, pitfalls, and perspectives.
IEEE Geosci Remote Sens Mag 9, 143–172. https://doi.org/10.48550/arXiv.2006.10027.

2. CHAPTER 2: IMPACT OF SAR PLANTING DATE ON CROP MODEL YIELD ESTIMATION

2.1. Introduction

Rice is one of the world's most common staple foods. With 164 million hectares of rice-cultivated land in 2020, rice is the third most widely produced cereal in the world (FAO, 2020). In recent years, rice production has been adversely affected worldwide by extreme weather events (Phung et al., 2020a). Fahad et al. (2018) found that an increase in the frequency or severity of hot weather could decrease rice production by up to 40% by the end of the 21st century. Monitoring paddy fields is therefore becoming crucial for national food security. Timely, fine-resolution information on key geo-bio-physical variables and events, such as pre-harvest assessment of the planting date (PD), crop type detection, acreage estimation, phenology monitoring, and harvest date determination, is essential for near real-time decision-making on rice production at scales from the individual farm to the whole country. Obtaining this information and ingesting it into crop models can improve rice yield nowcasts and forecasts.

Most rice crop growth models, such as the Decision Support System for Agrotechnology Transfer (DSSAT) (Jones et al., 2003), Agricultural Production Systems Simulator (APSIM) (Keating et al., 2003), CropSyst (Stöckle et al., 2003), Wageningen crop models (van Ittersum et al., 2003), and ORYZA (Bouman, 2001), use climate, soil, and ecophysiological crop parameters to simulate yields. Simulated crop yields are affected by human decisions, by forcings such as temperature (i.e., growing degree days, GDD) and precipitation rate and pattern, and by planting date, irrigation and fertilizer amount and timing, crop cultivar, and atmospheric CO2 (Baigorria et al., 2008).
Among all these, PD has a significant impact on the crop model output (Urban et al., 2018), owing to its effect on growth duration and critical phenological phases. For rainfed paddy in particular, the PD, the length of the wet and dry seasons, and the timing of precipitation can influence growth, physiology, and estimated yield (Tsimba et al., 2013). Several studies have shown that late sowing reduces yields whereas timely sowing increases crop production (Irwin and Hubbs, 2019; Kucharik, 2008). In almost all crop growth models, soil conditions, climatology, and water availability are used as inputs to calculate PD. Moreover, crop calendars developed from long-term PD observations provide a fixed PD that does not account for the current field conditions, weather events, and socioeconomic factors that may cause PD to diverge within the study region (Zhang et al., 2021). Hence, for a particular location or region of interest, deriving the PD from any single indicator is difficult and may lead to erroneous results that degrade crop model performance. Furthermore, the choice between long- and short-duration cultivars also plays a major role in PD selection (Mandal et al., 2018). Thus, optimal yield estimation through crop models requires realistic, near real-time information on PD.

Recent advances in remote sensing have given scientists and resource managers the opportunity to estimate PD from near-real-time satellite observations, which capture farmers' decisions and interventions, and to ingest it into crop models to forecast yields. Among remote sensing techniques, microwave, optical, and thermal sensors have provided useful tools for estimating crop PD, detecting crop types, and monitoring phenology (Boschetti et al., 2017). Satellite-based optical sensors provide multi-spatiotemporal reflectance data that can be used to derive time series of vegetation indices in cropland areas.
However, optical/thermal imagery is susceptible to clouds, aerosols, and saturation in areas with high biomass. Notably, in rainfed regions, overcast conditions are generally observed during the paddy planting period. Thus, since the 1990s, scientists have been using and exploring microwave SAR observations at finer resolution (~10 m) with day-night and all-weather capabilities for crop monitoring. Recent studies have evaluated the potential of SAR observations in different frequency bands for rice identification and phenology detection, including the L-band (Wang et al., 2005), C-band (Canisius et al., 2018), and X-band (Küçük et al., 2016). The results showed that C-band and higher frequencies are better suited to detecting crop attributes, since lower frequencies penetrate through crops/vegetation and also interact with the underlying layers (soil and water) (Friesen et al., 2012). SAR backscatter carries information on the structural and dielectric properties of the vegetation canopy that may be unique to each crop class, providing valuable information for phenology tracking and crop discrimination (Steele-Dunne et al., 2017). Additionally, the interaction between the SAR backscatter and the rice canopy depends on the electromagnetic polarization. A co-polarized signal like VV is particularly sensitive to the vertical orientation of the leaves (McNairn et al., 2009b), while a cross-polarized channel like VH shows a stronger correlation with the Leaf Area Index (LAI), resulting from the volume scattering within the crop canopy (McNairn and Brisco, 2004). With the launch of the Copernicus Sentinel-1 satellites in 2014 (Sentinel-1A) and 2016 (Sentinel-1B), multi-temporal, dual-polarized, C-band backscatter images with ~10-meter spatial resolution and a 6-12 day revisit time opened a new avenue for mapping rice fields and tracking growth/phenology.
Based on Sentinel-1 polarimetric observations, the following methods have previously been used to estimate the rice PD and growth cycles: (i) using backscatter intensities that show unique characteristics of the rice crop (flooded stage, low backscatter of water) (Yang et al., 2021), (ii) using machine learning crop classification and phenology detection techniques (Lasko et al., 2018), and (iii) using plant growth indices as a function of co-polarized and cross-polarized backscatter intensities to track the crop growth cycle (Mandal et al., 2020). Our study uses Sentinel-1A backscatter time-series analysis to estimate the PD for rice crops within a season. We focused particularly on assessing the impact of SAR-derived PD on crop model yield estimation, as no previous studies have examined this topic. The specific objectives of our study are as follows: (i) to map rice planting areas using analytical time-series analysis of Sentinel-1A, (ii) to determine the observed PD by tracking the backscatter time-series, and (iii) to evaluate the impact of the SAR-derived PD on the performance of a physically-based crop model in estimating yield.

2.2. Study area, Datasets, and Tools

2.2.1. Study Area

Rainfed paddy fields in Cambodia were chosen as our study area (Figure 2.1a). As a tropical country, Cambodia has a high average temperature, with a dry season preceding the wet season. The rainy season lasts from May to October, when ~75-80% of the total annual precipitation is brought by the south-westerly summer monsoon, and the dry season occurs from November to April (Nguyen et al., 2010). The majority of the paddy in Cambodia is rainfed and cultivated during the wet season (Wang et al., 2017). With ~2.9 million acres of rice cultivated and ~11 million tons of production, Cambodia was ranked tenth among the global rice producers in 2020 (FAO, 2020).
Seven provinces reported planting over 100,000 hectares: Prey-Veng, Battambang, Banteay-Meanchey, Takeo, Kampong-Thom, Siemreap, and Svay-Rieng (NIS, 2019). In this study, yield simulations were conducted for these provinces. In Cambodia, fertilizer use is much lower (~34.3 kg/ha in 2018) than in the neighboring Southeast Asian countries (World Data ATLAS).

Figure 2.1: a) Study area (Cambodia); the background in the map is the 30 m DEM. The rectangular tiles are the Sentinel-1A tiles showing three ascending satellite tracks covering Cambodia. The yellow points show the locations of the selected pixels in Figure 2.3. b) Region with rice crop in Cambodia for the year 2012 from the Open Development Mekong platform (https://bit.ly/3tFPuCI).

2.2.2. Datasets

Sentinel-1A (SAR) Data

The Alaska Satellite Facility (ASF) provides free access to Sentinel-1A IW Level-1 ground range detected (GRD) data with a 12-day repeat cycle of the C-band SAR. For this study, Sentinel-1A radiometrically terrain corrected (RTC) products generated by ASF using GAMMA software (Copernicus Sentinel data, 2022) were downloaded (from 2017-2020) for three different ascending tracks covering Cambodia (Figure 2.1a).

Satellite and Ancillary Data Used for Rice Yield Estimation

Table 2.1 provides information related to the precipitation, temperature, wind speed, crop calendars, crop maps, soil maps, rice yield observations, and fertilizer data used in this study. This study used a crop calendar developed by the Center for Sustainability and the Global Environment by compiling data on rice planting and harvesting dates from six different sources such as USDA-FAO, USDA-FAS, USDA-NASS, and IMD-AGRIMET (Sacks et al., 2010).

Table 2.1: Data used for the study.
| Parameter | Product | Spatial resolution | Temporal resolution | Temporal coverage | Reference |
|---|---|---|---|---|---|
| Precipitation | CHIRPS | 5 km | daily | 1981-present | (Daly et al., 2008) |
| Temperature | NCEP | 1.875 deg | daily | 1948-present | (Kanamitsu et al., 2002) |
| Wind speed | NCEP | 1.875 deg | daily | 1948-present | (Kanamitsu et al., 2002) |
| Soil type | SoilGrid250m | 250 m | static | 2017 | Open Land map |
| Crop Map & Calendar | SAGE | 18.5 km | static | One for all years | (Sacks et al., 2010) |
| SAR Backscatter | Sentinel-1A | 30 m | 12 days | 2016-present | ASF |
| Fertilizer | - | - | static | One pattern yearly | World Data ATLAS |
| Yield Observation | | Province | yearly | 2010-2020 | (GDA, 2020) |

2.2.3. Rice Crop Model

We implemented the Regional Hydrologic Extremes Assessment System (RHEAS) framework (Andreadis et al., 2017) over our study domain. It is a comprehensive drought and crop yield information system that loosely couples a hydrologic model (Variable Infiltration Capacity, VIC) with a crop model (Modified Decision Support System for Agrotechnology Transfer, M-DSSAT) (Ines et al., 2013). RHEAS allows for the investigation of the effects of weather conditions, hydrology, climate, and farm management practices (e.g., PD, fertilizer rates) on crop yield. Abhishek et al. (2021) previously used RHEAS in a regional study that evaluated the potential implications of growing season drought for the interannual variability of rice yields. Table 2.1 lists the meteorological forcings, soil properties, and land cover data incorporated into the VIC hydrologic model. The outputs from the hydrologic model (specifically, rainfall, air temperature, and net solar radiation) are passed to the crop model to simulate the crop yield. M-DSSAT runs in an ensemble mode to capture the variability in the agricultural system due to cultivars, soil types, fertilizer and irrigation applications, and PD.
In addition, RHEAS can assimilate surface soil moisture and LAI data from different remote sensing products. Here, we calculated the range of interannual variability of rice yields within each year among different provinces using the information from SAR-derived PD. We refer the reader to Abhishek et al. (2021) for a comprehensive and general presentation of the RHEAS model.

2.3. Methodology

The methodology used in this study comprises four main steps: (1) pre-processing of Sentinel-1A observations, (2) time-series analysis of VH and VV Sentinel-1A backscatter for rice mapping, (3) developing a PD retrieval algorithm using the VH Sentinel-1A backscatter time-series, and (4) yield estimation using SAR-derived PD in the RHEAS framework.

2.3.1. Pre-processing of Sentinel-1A images

RTC images processed by ASF with the GAMMA software were downloaded at a 30-meter resolution. We used the Sentinel-1A ascending mode for mapping rice and detecting PD in this study because of the negative impact of morning dew on SAR observations in tropical areas. Ten Sentinel-1A image tiles in the ascending mode were mosaicked to provide wall-to-wall coverage of Cambodia. The time differences between the three ascending tracks (Figure 2.1) are 2 and 5 days. The date assigned to the mosaicked landscape is based on the average overpass dates of the three tracks. To reduce speckle noise in the VH and VV time-series, the SAR images were aggregated to 100 meters using bilinear interpolation (a weighted average of the four nearest neighboring pixels). Using Equation (1), the linear VH and VV time-series were converted to the decibel (dB) scale. From here on, the Sentinel-1A backscatter is denoted as $\sigma^0$.

$$\sigma^0_{dB} = 10 \times \log_{10}(\sigma^0) \qquad (1)$$

where $\sigma^0$ is the backscatter at linear scale and $\sigma^0_{dB}$ is the backscatter at logarithmic scale (dB). The incidence angle range of Sentinel-1A over Cambodia is 37° to 46°.
Rice paddy fields in Cambodia are primarily located in flat terrain (Figure 2.1), which minimizes the incidence angle artifacts due to terrain (Lasko et al., 2018). Moreover, in the mosaicked landscape, the incidence angle and backscatter data in each overlap region were taken from only one granule, without any averaging. The detailed procedure for preprocessing the Sentinel-1A data is shown in Figure 2.2.

2.3.2. Algorithm for rice mapping

Cambodia includes a variety of land cover types, primarily dominated by croplands (particularly for rice agriculture), urban areas, forest, and grasslands. In this study, rice mapping was conducted using the $\sigma^0_{VH}$ and $\sigma^0_{VV}$ [dB] backscatter time-series. An analytical time-series analysis algorithm was developed to differentiate between rice and the other crops/land covers. In this algorithm, a set of statistical parameters was defined for the cross-polarized $\sigma^0_{VH}$ observations, and vegetation growth indices were calculated from both the $\sigma^0_{VH}$ and $\sigma^0_{VV}$ observations. The algorithm uses the cross-polarization channel ($\sigma^0_{VH}$) as the main SAR observation for both mapping rice and detecting PD. This is because $\sigma^0_{VV}$ observations are influenced by standing water and largely attenuated by the rice crop, while the $\sigma^0_{VH}$ data are less affected and more representative of rice growth and plant canopy structure (Son et al., 2021). Additionally, Inoue et al. (2014a) found a stronger correlation between $\sigma^0_{VH}$ and LAI, resulting from the volume scattering within the crop canopy. In a preliminary step, the rice mapping algorithm distinguishes between crop and non-crop pixels by applying a threshold to the average and standard deviation (STD) of the $\sigma^0_{VH}$ time-series (Figure 2.2, Process 1). For further differentiation between rice and other crops, four thresholds were defined for the minimum and maximum backscatter values (Figure 2.2, Process 2).
To choose these thresholds, a sample analysis of known paddy fields taken from land cover information was conducted to identify a temporal signature of $\sigma^0$ values over the rice-planted areas (Figure 2.1b). Three vegetation indices were used to enhance rice detection precision: (i) the polarization ratio ($\sigma^0_{VH}/\sigma^0_{VV}$) (Veloso et al., 2017), (ii) the Dual-pol Radar Vegetation Index (RVI) ($4\sigma^0_{VH}/(\sigma^0_{VV}+4\sigma^0_{VH})$) (Nasirzadehdizaji et al., 2019), and (iii) the Dual Polarization SAR Vegetation Index (DPSVI) ($(\sigma^0_{VV}+\sigma^0_{VH})/\sigma^0_{VV}$) (Mandal et al., 2020). We defined thresholds for the standard deviation and average of the above indices to distinguish the paddy pixels from the non-rice pixels using countrywide sampling. The detailed rice mapping procedure is shown in Figure 2.2; Process-1 and Process-2 are elaborated in the following steps:

Process-1: Differentiating between crop and non-crop. Pixels with STD($\sigma^0_{VH}$) less than 1 are classified as non-crop pixels. These pixels are mostly forest and urban areas.

Process-2: Differentiating between rice and the other crops.

Process-2.1: Defining a backscatter range for $\sigma^0_{VH}$ variation by averaging the behavior of ~100-200 backscatter time-series over the verified paddy fields from land cover data (Figure 2.1b). The $\sigma^0_{VH}$ mainly varies from -12 to -24 dB over paddy fields, depending on the pixel's location in the near or far range of the swath. In the parts of Cambodia affected by flooding, the minimum $\sigma^0_{VH}$ may decrease to -30 dB because of specular reflection from the flooded area.

Process-2.2: Defining thresholds for STD($\sigma^0_{VH}/\sigma^0_{VV}$), Mean($\sigma^0_{VH}/\sigma^0_{VV}$), STD(DPSVI), Mean(DPSVI), and STD(RVI) using the countrywide sampling to increase the accuracy of rice differentiation from the other crops.

Process-2.3: Eliminating the pixels in which the highest rise in the $\sigma^0_{VH}$ time-series is less than 4 dB in 24 days (two consecutive Sentinel-1A observations).
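As an illustration, the per-pixel screening in Process-1, Process-2.1, and Process-2.3 can be sketched as below. This is a minimal sketch under stated assumptions, not the implementation used in this study: the function names are hypothetical, the 24-day rise is assumed to span two 12-day acquisition intervals, the upper and lower VH bounds are read from the -12 dB and -30 dB values quoted above, and the index thresholds of Process-2.2 are omitted because their values are not given in the text.

```python
import numpy as np


def to_db(sigma0_linear):
    """Equation (1): convert linear backscatter to decibels."""
    return 10.0 * np.log10(np.asarray(sigma0_linear, dtype=float))


def max_rise(vh_db, step=2):
    """Largest VH increase across `step` acquisitions.

    With a 12-day repeat cycle, step=2 spans 24 days (an assumption about
    how the 24-day rise is measured).
    """
    vh_db = np.asarray(vh_db, dtype=float)
    return float(np.max(vh_db[step:] - vh_db[:-step]))


def is_rice_candidate(vh_db, std_min=1.0, vh_bounds=(-30.0, -12.0), rise_min=4.0):
    """Screen one pixel's VH time-series (in dB) with the stated thresholds."""
    vh_db = np.asarray(vh_db, dtype=float)
    # Process-1: low temporal variability is treated as non-crop (forest, urban).
    if np.std(vh_db) < std_min:
        return False
    # Process-2.1: VH should stay inside the range seen over verified paddies
    # (-24 to -12 dB, relaxed to -30 dB for flooded pixels).
    if vh_db.min() < vh_bounds[0] or vh_db.max() > vh_bounds[1]:
        return False
    # Process-2.3: require a sharp rise of at least 4 dB within 24 days.
    return max_rise(vh_db, step=2) >= rise_min


# Hypothetical time-series (dB), 12 days apart.
paddy = [-18, -19, -22, -20, -16, -15, -14, -14, -15, -17]
forest = [-13.1, -13.0, -13.2, -12.9, -13.0, -13.1]
print(is_rice_candidate(paddy), is_rice_candidate(forest))
```

Keeping each criterion as a separate early return mirrors the sequential structure of Figure 2.2 and makes it straightforward to add the Process-2.2 index thresholds once their values are known.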
The 4 dB threshold was determined by averaging the behavior of the $\sigma^0_{VH}$ time-series over the verified rice fields from the land cover data (Figure 2.1b).

Figure 2.2: Methodology for identifying rice acreage, retrieving PD, and estimating rice yield. Section 2.3.2 provides details on each parameter; t is the observed dates in the $\sigma^0_{VH}$ time-series within the wet season, and tp is the PD candidate and the start time of the sorted slopes in Process-3.

2.3.3. Planting date detection

By considering the unique phenology of the paddy $\sigma^0$ signals as well as a general time period for the growing seasons (e.g., May to August is the wet season in Cambodia), we found that the local PD coincides with the local minimum value of the $\sigma^0$ time-series, which indicates an inundated surface. Both sowing and transplanting result in a double bounce at the beginning of the growth cycle that causes an increase in the $\sigma^0$ time-series. In the case of transplanting, the double bounce occurs immediately after transplantation. For sowing, after 15-20 days the plant height is 15-20 cm with 2-3 leaves; this is when the soil is flooded with water and a sharp rise is observed in the $\sigma^0$ time-series. Therefore, as the beginning of the rise in the $\sigma^0$ time-series is 12 days before the double bounce occurrence, it can be a good representative of the sowing time, sowing being the most commonly used rice planting method (Kamoshita et al., 2016). Notably, water droplets on the rice leaves can also increase the radar backscatter intensity by up to 2 dB in the co- and cross-polarized backscatter time-series (Kabbazan et al., 2022); however, this increase is negligible compared to the sharp rise due to the planting double bounce effect. Most of the rice fields in Cambodia are rainfed and no data are available on rice PD; thus, a larger time window from March to August was considered in order to determine PD in various provinces.
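The candidate search for the planting date can be sketched as follows. This is a minimal sketch under stated assumptions rather than the algorithm as implemented: `detect_planting_date` and its parameters are hypothetical names, the rise is measured over two 12-day steps (24 days), candidates are restricted to the March to August window, and only the "no further sharp rise above 4 dB within 36-60 days" check of Figure 2.2 (Process-3) is applied. The additional backscatter-level check in Process-3 is left out because its threshold is not unambiguous in the text.

```python
from datetime import date, timedelta

import numpy as np


def detect_planting_date(dates, vh_db, months=(3, 8), step=2,
                         check_days=(36, 60), max_later_rise=4.0):
    """Return the start date of the steepest valid VH rise, or None.

    dates  : list of datetime.date, one per acquisition (12 days apart)
    vh_db  : VH backscatter in dB for the same acquisitions
    """
    vh = np.asarray(vh_db, dtype=float)
    # rises[i] is the VH increase between acquisition i and i + step.
    rises = vh[step:] - vh[:-step]
    # Examine the rises from the highest to the lowest.
    for i in np.argsort(rises)[::-1]:
        if rises[i] <= 0:
            break  # no positive rises left
        start = dates[i]
        if not (months[0] <= start.month <= months[1]):
            continue
        # Acquisitions that fall 36-60 days after the candidate date.
        later = [j for j, d in enumerate(dates)
                 if check_days[0] <= (d - start).days <= check_days[1]]
        if len(later) < 2:
            continue  # not enough data to verify this candidate
        # Reject candidates followed by another sharp rise in that period.
        if np.max(np.diff(vh[later])) > max_later_rise:
            continue
        return start
    return None


# Hypothetical 2020 time-series with a flooded minimum on 12 May.
example_dates = [date(2020, 3, 1) + timedelta(days=12 * k) for k in range(15)]
example_vh = [-18, -18, -19, -18, -19, -20, -23, -19, -16,
              -15, -14.5, -14, -14, -15, -16]
print(detect_planting_date(example_dates, example_vh))
```

Because the returned date is an acquisition date, the estimate inherits the 12-day sampling of Sentinel-1A; it identifies the observation at which the rise starts, not the calendar day of sowing.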
It is easier to determine the PD over irrigated paddy fields than over rainfed rice fields because of the distinct seasonal rise-and-fall pattern in the $\sigma^0$ time-series of irrigated rice fields. Rainfed areas are usually plowed to a depth of 70-100 mm (depending on the soil condition), and they may need to be plowed again within 3-6 weeks after the initial plowing, which increases the soil surface roughness. Consequently, the $\sigma^0$ time-series varies frequently before planting in rainfed areas, and these variations can be mistaken for a planting signal. The $\sigma^0_{VH}$ time-series shown in Figure 2.3a is from the southern part of Cambodia (next to Vietnam, where irrigated paddy fields are common practice). The pattern in the time-series indicates that irrigation is widely used in this area, making it easy to choose the local minimum or the highest rise slope in the $\sigma^0$ time-series as the time of sowing. Conversely, in rainfed rice fields, the $\sigma^0$ time-series changes continuously (Figure 2.3b-f) and does not have a consistent pattern, making the detection of PD very challenging. Therefore, other criteria were incorporated into the algorithm to identify the increase associated with planting. Accordingly, each sharp rise in the time-series for a specific period (wet season) was examined using several criteria, such as the slope and the $\sigma^0_{VH}$ variation between 36 and 60 days after sowing. Since an average overpass date was considered for the mosaicked landscape of Cambodia (Section 2.3.1), the PD for each province was corrected using the Sentinel-1A track associated with the province. The detailed PD detection procedure is shown in Figure 2.2.

Process-3: Estimation of planting date for the rice field map determined from Process-2. For a specific time period (March to August), all the rises in the $\sigma^0_{VH}$ time-series were sorted from the highest to the lowest value.
A parameter was defined to calculate the slope of each rise in the $\sigma^0_{VH}$ time-series by measuring the increase in $\sigma^0_{VH}$ in 24 days. Starting from the highest slope in the list and considering the start time of the slope as the candidate PD, the $\sigma^0_{VH}$ behavior was monitored 36-60 days after the candidate PD. P3.1 to P3.3 filter out PD candidates based on the criteria that $\sigma^0_{VH}$ must be less than 18 [dB] in 36-60 days after planting, and that the $\sigma^0_{VH}$ time-series should not have a sharp rise higher than 4 [dB] within this period.

Figure 2.3: Temporal backscatter profile of $\sigma^0_{VH}$ for rainfed rice in year 2020 for different pixels in the north and south part of Cambodia (panels a-f). The red line represents the sharp rise (difference $\sigma^0_{PD+12} - \sigma^0_{PD}$) associated with the planting, and the start of the rise was taken as the time of sowing.

A sharp rise in the $\sigma^0$ time-series was taken as an indicator of the sowing stage.

2.3.4. Yield estimation using RHEAS Tool

To evaluate the efficacy of PD in estimating rice yield, the SAR-derived and crop calendar PDs were used as separate inputs to the crop model (M-DSSAT). Forty ensembles of yield scenarios were generated for each province to capture the variability due to soil, cultivars, PD, and forcings. A sequential resampling process was applied to assign the most frequent SAR-derived PD to the ensembles in each province, ensuring that the most probable dates from the SAR data were used. The resampling process is analogous to rolling an N-sided loaded die, where N is the number of SAR-derived PDs (here N=12). The probability of rolling each side of the die is defined by weights calculated as the fraction of the total area for each growing month (March to August) times the fraction of the month for each specific date. Hence, PDs with higher weights have higher probabilities of propagating through the ensembles.
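The loaded-die analogy can be made concrete with a short sketch. This is a simplified illustration, not the RHEAS resampling code: it uses plain weighted sampling with replacement in place of the sequential resampling procedure, and the function name, argument names, and example numbers are all hypothetical.

```python
import numpy as np


def resample_planting_dates(pd_doys, month_area_frac, date_frac_in_month,
                            n_members=40, seed=0):
    """Draw one planting date (day of year) per ensemble member.

    pd_doys            : SAR-derived planting dates (the N sides of the die)
    month_area_frac    : for each date, fraction of total area planted in its month
    date_frac_in_month : for each date, its fraction within that month
    """
    # Weight of each date = month area fraction x within-month fraction.
    weights = (np.asarray(month_area_frac, dtype=float)
               * np.asarray(date_frac_in_month, dtype=float))
    probs = weights / weights.sum()  # normalize so the die is well defined
    rng = np.random.default_rng(seed)
    draws = rng.choice(np.asarray(pd_doys), size=n_members, replace=True, p=probs)
    return draws, probs


# Hypothetical example with four candidate dates instead of twelve.
doys, probs = resample_planting_dates(
    pd_doys=[100, 112, 124, 136],
    month_area_frac=[0.1, 0.1, 0.4, 0.4],
    date_frac_in_month=[0.5, 0.5, 0.25, 0.75],
)
print(probs)  # normalized probability of each planting date
```

Dates that cover a larger share of the planted area receive proportionally more of the 40 ensemble members, which is the behavior the weighting is meant to produce.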
In addition to PD, fertilizer and irrigation application and cultivar varieties also play an important role in simulating yield. To obtain a fair estimate of rice yields in each province, rice cultivar genetic coefficients must be calibrated using observed yields under no-stress conditions (Boote, 1999). To calibrate the genetic coefficients, the two most common cultivars in Cambodia reported by FAO (Phka Rumduol and Sen-Pidao) were used. Phka Rumduol is a premium aromatic jasmine rice cultivated during the wet season; it is a tall, photosensitive, medium-maturity variety with a long growing season. Sen-Pidao is a soft cooking rice with a non-glutinous aromatic flavor, short duration, and early maturity, which makes it similar to jasmine rice (Nadar et al., 2020). This study used the Shuffled Complex Evolution Algorithm to calibrate cultivar coefficients. This algorithm mainly consists of multiple complex shuffling and competitive evolution based on the simplex search method (Guo et al., 2013). The simulated and observed yields for 11 years (2010-2020) were used to optimize the cultivar coefficients.

2.4. Results

2.4.1. Rice mapping

Our study found that the vegetation indices RVI, DPSVI, and $\sigma^0_{VH}/\sigma^0_{VV}$ were equally effective for rice mapping, and that the results improved significantly compared to the rice map in Figure 2.1b when the indices were applied in the rice mapping algorithm (Figure A1 in the supplementary material shows the detected rice map in 2020 before employing RVI, DPSVI, and $\sigma^0_{VH}/\sigma^0_{VV}$ in the rice mapping algorithm). The $\sigma^0_{VH}$ signal, however, showed a better capability in PD detection than $\sigma^0_{VH}/\sigma^0_{VV}$, RVI, and DPSVI (Yang et al., 2021). The paddy fields shown in Figure 2.4 are mostly found in low-altitude regions along the Tonle Sap Lake and the Mekong River. Rice fields are relatively sparse in the eastern part of Cambodia, which is consistent with the land cover map of Cambodia (2012) in Figure 2.1b.
The rice field signature detected close to the lake shores (around the lake) is caused by the similarity in the $\sigma^0_{VH}$ time-series between wetlands and paddy fields. According to the land cover map of Cambodia, the northwestern region close to the Tonle Sap has a number of wetlands and forests. To evaluate whether the rice mapping algorithm could distinguish between crops and non-crops, we did not use a non-cropland mask in this study. A comparison of the rice detection map from our study (Figure 2.4) with the land cover map of Cambodia (Figure 2.1b) shows that our rice mapping algorithm is capable of distinguishing between crops and non-crops as well as rice from other crops.

Figure 2.4: Rice planted area estimated from Sentinel-1A $\sigma^0_{VH}$ observations in four years (2017-2020).

Furthermore, using the same approach, we detected a significant expansion in the estimated rice planting areas in 2020 relative to the previous years (Figure 2.4). The additional paddy fields detected in 2020 are false alarms due to the flooding in Cambodia between September and November 2020, caused by 13 consecutive tropical storms that resulted in unprecedented flooding and landslides in 19 provinces (Tamesis et al., 2020). This flooding could have resulted in spurious fluctuations in $\sigma^0$, such as many sharp peaks and valleys in the $\sigma^0$ time-series, similar to the backscatter signatures over paddy fields, wetlands, and low-lying regions.

2.4.2. Maps of planting dates

Figure 2.5 shows the PD retrieved using an analytical $\sigma^0_{VH}$ time-series analysis. In this study, $\sigma^0_{VH}$ time-series were used to detect paddy PD. The dates in the legend are derived from the overpass times of Sentinel-1A, which are composited using three satellite tracks covering Cambodia. The estimated PD maps demonstrate that the PD varies not only throughout the country but also within each province (Figure 2.6 and Figure 2.7).
Seven of the 25 administrative units in Cambodia (the most cultivated areas (USDA 2020)) were selected to examine how SAR-derived PD may vary from commune to commune and what effects dynamic PD may have on simulated yields. Figure 2.6 and Figure 2.7 show PD at Banteay-Meanchey in the north and Svay-Rieng in the south of Cambodia, respectively. The dates in the legend are the exact observed dates of the Sentinel-1A images before compositing three satellite tracks over Cambodia. Figures 2.5, 2.6, and 2.7 collectively illustrate farmers planting rice from May through July in 2020, while the planting season varied widely from March through August during 2017-2019. We hypothesize that the delayed planting in 2019 (starting from April) and 2020 (starting from May) is a result of flooding in March 2019 and in March and April 2020, specifically in the north of Cambodia (Banteay-Meanchey province). Inundation by flood can prevent farmers from using their ploughing machinery to prepare their fields, resulting in late planting (Figure A2 in the supplementary material shows the area inundated in Banteay-Meanchey in March, April, and May of 2017 through 2020). The SAR-based PD can be determined by observing the field before and after the PD in irrigated paddy fields (Yang et al., 2021), whereas in rainfed rice the $\sigma^0$ time-series needs to be monitored for 36-60 days after planting to distinguish between real and spurious fluctuations (as described in the previous section) during the planting season (Figure 2.3).

Figure 2.5: The SAR-derived PD estimated using analytical analysis of the $\sigma^0_{VH}$ time-series for 2017-2020.

Figure 2.6: Planting dates map over Banteay-Meanchey province in the north of Cambodia.

Figure 2.7: Planting dates map over Svay-Rieng province in the south of Cambodia.

2.4.3. Impact of SAR-derived planting date on yield estimation

To test whether SAR-derived PD improves the yield estimation, two simulations were conducted: 1) a data denial simulation by the M-DSSAT model using a fixed crop calendar PD (i.e., June 1 or DOY 152), and 2) a simulation using the SAR-derived PD from Sentinel-1A observations. The only variable changing through the 40 ensembles between the two simulations is the PD. Hence, the difference between the fixed and SAR-derived PD is the only factor contributing to the estimated yield changes through the ensembles in each province. Figure 2.8 and Figure 2.9 show the time difference between the fixed PD from the crop calendar and the SAR-derived PD for Banteay-Meanchey and Svay-Rieng provinces, respectively. Green and red areas in the figures refer to earlier and later PD, respectively, and yellow areas have a PD close to the fixed date. According to Figure 2.8 and Figure 2.9, the difference between the SAR-derived PD and the fixed date can be as high as 75 days (earlier or later).

Figure 2.8: The difference between the fixed PD from the crop calendar (DOY 152) and the SAR-derived PD in Banteay-Meanchey in Cambodia's north. Negative and positive numbers indicate planting earlier and later than the fixed date, respectively. The inset represents the fraction of total area (%) for each PD difference.

Green areas in Figure 2.8 and Figure 2.9 indicate early planting in March and April. In Banteay-Meanchey (as an example of a northern province), planting is very rare during March and April, which is to be expected from a country known for its rainfed paddy fields. However, ~30 percent of the total area of Banteay-Meanchey was planted in March through April in 2018, primarily due to the early onset of the summer monsoon. According to our investigation, Svay-Rieng, which neighbors Vietnam, uses irrigation for planting rice. This is also supported by the backscatter time-series pattern of this province (Figure 2.3a).
Additionally, March and April planting declined in both provinces from 2017 to 2020. In Svay-Rieng, the share of area planted in March and April (-75 to -60 days PD difference) declined from 30% in 2017-2018 to 15% in 2019-2020 (Figure 2.9). Changes in weather and a lack of irrigation water could contribute to this decline.

Figure 2.9: The difference between the fixed PD from the crop calendar (1st of June, DOY 152) and the SAR-derived PD in Svay-Rieng in Cambodia's south.

Figure 2.10 and Figure 2.11 compare two simulated rice yields using a fixed PD and the SAR-derived PD over Banteay-Meanchey and Svay-Rieng, respectively. In both simulations, all other input parameters, such as weather, soil type, crop type, cultivar coefficients, and fertilizer rate, were kept the same through the ensembles. Only the fixed PD (June 1 or DOY 152) from the crop calendar was changed to the dynamic SAR-derived PD to determine the effect of the PD on rice yield estimation. The results illustrate that the M-DSSAT model mostly overestimated the rice yield using the fixed PD, while applying the SAR-derived PD to M-DSSAT significantly improved both the average and the median of the simulated yield. In general, estimated yield decreases as the PD moves earlier in the year and increases as the PD moves forward. Wang et al. (2017) studied the effects of shifting PD forward and backward with different fertilizer application rates on yield simulation using the CERES-Rice model. In general, they concluded that shifting PD backward and forward for the Sen-Pidao and Phka-Rumduol cultivars resulted in a decrease and an increase in yield, respectively.

Figure 2.10: Estimated rice yield in 2017-2020 in Banteay-Meanchey province. The gray and the blue boxes show the estimated rice yield using the fixed and the dynamic SAR-derived PD, respectively. The normalized bias was calculated using the average value of the simulated yield.

Figure 2.11: Estimated rice yield in 2017-2020 in Svay-Rieng province. The gray and the blue boxes show the estimated rice yield using the fixed PD and the dynamic SAR-derived PD, respectively.

According to Figures 2.10-11 and Table 2.2, the normalized bias of estimated yield decreased by ~1.5-12% in Banteay-Meanchey and ~27-40% in Svay-Rieng in the study period after using the SAR-derived PD in the crop model. With a higher fraction of total area in the -75 to -60 days PD difference class, Svay-Rieng province's yield bias is reduced more than Banteay-Meanchey's after using the SAR-derived PD in the M-DSSAT model. Figure 2.8 illustrates that in 2020 the positive and negative PD differences compensate for one another in Banteay-Meanchey, and the 30% area fraction for the +15 days difference resulted in yield overestimation and a 1.5% increase in normalized bias (Table 2.2). In 2017, however, the area fraction for -75 to -60 days (PD difference) is higher in this province, resulting in a 12% reduction in the normalized yield bias compared to the fixed PD. Table 2.2 shows the improvement in the average estimated yield bias in all seven provinces when using the SAR-derived PD in the M-DSSAT model instead of the fixed PD. Additionally, the interquartile range of estimated yield decreased in all the provinces when the SAR-derived PD was used in the crop model (Figures 2.10-11 and Table 2.2), resulting in a significant reduction in yield estimation uncertainty in almost all the provinces of Cambodia. For example, the uncertainty in yield estimation was reduced by 38-65% in Banteay-Meanchey and 7-44% in Svay-Rieng.

Table 2.2: Mean estimated rice yield in 2017-2020 using fixed and SAR-derived PD for seven provinces in Cambodia. A positive bias difference indicates an improvement in yield estimation using the SAR-observed PD compared to using the fixed PD in the crop model, and a negative bias difference indicates an increase in normalized bias.
| Year | Province | Yield (kg/ha) Fixed PD | Yield (kg/ha) SAR derived PD | Yield (kg/ha) Observation | Normalized bias % (Fixed PD) | Normalized bias % (SAR derived PD) | Bias difference % | Interquartile range (Fixed PD) | Interquartile range (SAR derived PD) | Uncertainty Reduction % |
|---|---|---|---|---|---|---|---|---|---|---|
| 2017 | Banteay Meanchey | 3626.2 | 3244.9 | 3224.5 | 12.5 | 0.6 | +11.8 | 1287.3 | 798.0 | +38.0 |
| 2017 | Battambang | 2945.1 | 3080.4 | 2976.6 | -1.1 | 3.5 | -2.4 | 1532.3 | 1506.0 | +1.7 |
| 2017 | Siemreap | 3614.5 | 2639.6 | 2581.5 | 40.0 | 2.2 | +37.8 | 3110.5 | 2050.3 | +34.1 |
| 2017 | Kampong Thum | 3140.7 | 2554.9 | 2659.6 | 18.1 | -3.9 | +22.0 | 1728.0 | 1033.3 | +40.2 |
| 2017 | Prey Veng | 4135.5 | 3596.5 | 3100.1 | 33.4 | 16.0 | +17.4 | 3519.8 | 2690.0 | +23.6 |
| 2017 | Svay Rieng | 3733.8 | 2583.7 | 2578.2 | 44.8 | 0.2 | +44.6 | 2511.0 | 2128.8 | +15.2 |
| 2017 | Takeo | 3592.8 | 3097.5 | 3029.4 | 18.6 | 2.3 | +16.4 | 2643.8 | 1427.5 | +46.0 |
| 2018 | Banteay Meanchey | 3555.8 | 3193.0 | 3248.1 | 9.5 | -1.7 | +7.8 | 1091.5 | 385.5 | +64.7 |
| 2018 | Battambang | 3344.9 | 3050.1 | 3159.7 | 5.9 | -3.5 | +2.4 | 1695.0 | 1332.3 | +21.4 |
| 2018 | Siemreap | 3948.7 | 2377.0 | 2604.7 | 51.6 | -8.7 | +42.9 | 3053.8 | 1869.3 | +38.8 |
| 2018 | Kampong Thum | 2741.5 | 2592.3 | 2756.3 | -0.5 | -5.9 | -5.4 | 1583.0 | 765.0 | +51.7 |
| 2018 | Prey Veng | 4222.5 | 3211.2 | 3207.8 | 31.6 | 0.1 | +31.5 | 3962.0 | 2440.5 | +38.4 |
| 2018 | Svay Rieng | 3820.9 | 2607.1 | 2487.2 | 53.6 | 4.8 | +48.8 | 2885.5 | 1755.0 | +39.2 |
| 2018 | Takeo | 3546.9 | 3371.3 | 3071.4 | 15.5 | 9.8 | +5.7 | 1715.8 | 1208.8 | +29.6 |
| 2019 | Banteay Meanchey | 3817.5 | 3580.9 | 3386.1 | 12.7 | 5.8 | +7.0 | 1150.0 | 621.5 | +46.0 |
| 2019 | Battambang | 3314.1 | 3050.1 | 2947.8 | 12.4 | 3.5 | +9.0 | 1458.3 | 1451.8 | +0.4 |
| 2019 | Siemreap | 3182.9 | 3068.2 | 2367.3 | 34.5 | 29.6 | +4.8 | 1896.5 | 2306.3 | -21.6 |
| 2019 | Kampong Thum | 3024.8 | 2641.9 | 2695.9 | 12.2 | -2.0 | +10.2 | 1373.0 | 864.0 | +37.1 |
| 2019 | Prey Veng | 3691.0 | 3155.7 | 3250.0 | 13.6 | -2.9 | +10.7 | 2351.0 | 2253.0 | +4.2 |
| 2019 | Svay Rieng | 3380.0 | 2527.7 | 2567.7 | 31.6 | -1.6 | +30.1 | 2091.0 | 1951.3 | +6.7 |
| 2019 | Takeo | 3472.7 | 3182.2 | 3082.0 | 12.7 | 3.3 | +9.4 | 1437.3 | 1644.8 | -14.4 |
| 2020 | Banteay Meanchey | 3815.0 | 3862.8 | 3259.9 | 17.0 | 18.5 | -1.5 | 1411.0 | 850.3 | +39.7 |
| 2020 | Battambang | 3768.9 | 3180.7 | 3038.1 | 24.1 | 4.7 | +19.4 | 2051.0 | 1413.3 | +31.1 |
| 2020 | Siemreap | 3555.3 | 3056.4 | 2770.5 | 28.3 | 10.3 | +18.0 | 2788.3 | 2177.8 | +21.9 |
| 2020 | Kampong Thum | 2923.6 | 2844.9 | 2837.6 | 3.0 | 0.3 | +2.8 | 1309.3 | 1012.3 | +22.7 |
| 2020 | Prey Veng | 4272.2 | 3079.4 | 3343.5 | 27.8 | -7.9 | +19.9 | 3669.3 | 2178.8 | +40.6 |
| 2020 | Svay Rieng | 3891.6 | 2301.0 | 2585.0 | 50.5 | -11.0 | +39.6 | 2829.8 | 1583.5 | +44.0 |
| 2020 | Takeo | 3863.2 | 3239.0 | 3093.3 | 24.9 | 4.7 | +20.2 | 2496.5 | 1532.8 | +38.6 |

2.5. Discussion

2.5.1. Capabilities and limitations of SAR in acreage and planting date detection

We demonstrated the utility of SAR imagery for the detection of lowland rice, particularly in tropical and subtropical regions where widespread cloud cover, complex cropping systems, and land use changes (e.g., forestation) prevail and prevent the use of optical imagery. Rice mapping in Cambodia has become feasible because of the high spatial resolution of SAR images (<=100 m), as the paddy fields in this region are typically less than two hectares and, in some cases, less than one hectare (Pandey et al., 2010). Moreover, once the fields are flooded, soil moisture and surface roughness no longer affect the backscattered SAR signals in rice fields, which makes SAR observations more effective in determining rice PD. For accurate detection of rice fields, the SAR acquisitions should cover the whole season, which lasts three to four months (Nelson et al., 2014). On the other hand, from the perspective of data mining, using all images for one year would maximize rice mapping accuracy.
In irrigated paddy fields, the SAR-based PD can be determined from two images, one before and one after the PD (Yang et al., 2021). In rainfed rice areas such as those in Cambodia, however, the backscatter time-series needs to be monitored for ~36-60 days after planting to distinguish the actual increasing trend (caused by rice planting) from spurious fluctuations (frequent falls and rises due to field preparation before planting or other causes, as illustrated in Figure 2.3).

2.5.2. Impact of the 12-day Sentinel revisit time on SAR-derived planting date accuracy

The yearly variations of the SAR-derived PD in Figure 2.5 are related to the shifting of the monsoon season (Loo et al., 2015). Because the monsoon onset and pattern are unpredictable, rice yield estimation using the SAR-derived PD can provide a more optimal estimate of yield. Moreover, the tropical and subtropical climate in much of Asia means that rice can be cultivated with diverse cropping calendars and practices over very short distances (Nguyen et al., 2012). In addition to the annual changes, each province in Cambodia has some variation in its planting schedule. Because farmers' planting decisions are likely based on rules of thumb that include what their neighbors are doing (Zhang et al., 2021), PD clusters can be observed across each province (Figure 2.6 and Figure 2.7). Since the Sentinel-1A revisit time is 12 days, it is not possible to determine the exact planting date using the $\sigma^0$ time-series. The wall-to-wall coverage of Sentinel-1A granules over Cambodia was achieved by compositing the observations over 12 days, and each composited observation map is assigned an average date computed from the first and last dates of the 12-day interval. This approach ensures that the planting date uncertainty for a given pixel at 100 m resolution is within +/- 6 days.
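The monitoring logic described above for rainfed fields can be illustrated with a minimal sketch. It is not the algorithm developed in this chapter: the function names, the three-composite persistence window (36 days at a 12-day step), and the 1 dB rise threshold are illustrative assumptions. The sketch searches for the first composite that is followed by a sustained backscatter increase, so that a single spurious rise is ignored, and reports the centre of the 12-day window together with its half-width.

```python
import numpy as np

COMPOSITE_DAYS = 12  # Sentinel-1A revisit and compositing interval (days)


def detect_planting_composite(sigma_vh_db, n_persist=3, min_rise_db=1.0):
    """Return the index of the first composite followed by a sustained rise.

    sigma_vh_db : VH backscatter in dB, one value per 12-day composite.
    n_persist   : consecutive increasing steps required (assumed: 3 = 36 days).
    min_rise_db : minimum total rise over the window (assumed threshold).
    Returns None when no sustained rise is found.
    """
    s = np.asarray(sigma_vh_db, dtype=float)
    steps = np.diff(s)
    for i in range(len(s) - n_persist):
        window = steps[i:i + n_persist]
        total_rise = s[i + n_persist] - s[i]
        if np.all(window > 0) and total_rise >= min_rise_db:
            return i
    return None


def composite_to_doy(index, first_doy):
    """Map a composite index to (centre day of year, half-width in days)."""
    centre = first_doy + index * COMPOSITE_DAYS + COMPOSITE_DAYS / 2
    return centre, COMPOSITE_DAYS / 2


# Hypothetical series: one spurious rise (index 1 to 2) before the real trend.
series = [-17.0, -18.5, -16.8, -19.5, -21.0, -19.8, -18.2, -16.9, -16.0]
idx = detect_planting_composite(series)
print(idx, composite_to_doy(idx, first_doy=100))
```

Requiring several consecutive increases is one simple way to encode the 36-60 day monitoring period; a shorter window would react faster but would be more easily triggered by field-preparation fluctuations.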
The 12-day overpass interval can be shortened to 6 days by including descending Sentinel-1A overpasses. However, this complicates the PD retrieval algorithm because the descending overpasses have different incidence angles and azimuth, and the algorithm performance depends largely on backscatter observations with the same incidence angle and azimuth. It is also possible to improve the revisit interval by combining Sentinel-1A and Sentinel-1B, which provides SAR backscatter with a 6-day revisit time. In this case too, however, the difference in incidence angle and azimuth would reduce the performance of our algorithm. Moreover, Sentinel-1B has been unavailable since 23 December 2021, so the combined Sentinel-1A and 1B product cannot be used in the future (ESA, 2022). We used an indirect approach to validate the SAR-derived planting date, through comparison with the rice yield data for the provinces in Cambodia. Our study demonstrated a significant improvement in the bias and RMSE of rice yields after using the SAR-derived planting dates instead of the conventional crop calendar in modeling rice production over the whole study area.

2.5.3. Rice yield estimation

Table 2.2 shows the reduction in both the normalized bias and the interquartile range (uncertainty) of the estimated yield that results from using the SAR-derived PD in a crop model. A fixed PD in the M-DSSAT model can cause a wide range of crop yield estimates because the PD is constant for all the ensembles. The actual precipitation and temperature (i.e., water availability and growing degree days) during crop growth may therefore not be represented, resulting in an under- or overestimation of yield. In contrast, a SAR-derived PD is an observation of the farmers' decision to plant and represents near-actual ground conditions.
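For reference, the bias and uncertainty columns of Table 2.2 are consistent with the definitions sketched below. These definitions are inferred from the tabulated values rather than stated explicitly in the text, and the function names are illustrative: normalized bias is taken as the relative difference between estimated and observed yield, and uncertainty reduction as the relative shrinkage of the interquartile range. The example uses the 2017 Banteay-Meanchey row.

```python
def normalized_bias(estimated, observed):
    """Relative difference between estimated and observed yield, in percent."""
    return 100.0 * (estimated - observed) / observed


def uncertainty_reduction(iqr_fixed, iqr_sar):
    """Relative shrinkage of the interquartile range, in percent."""
    return 100.0 * (iqr_fixed - iqr_sar) / iqr_fixed


# 2017, Banteay-Meanchey (values from Table 2.2; yields in kg/ha)
bias_fixed = normalized_bias(3626.2, 3224.5)       # fixed PD
bias_sar = normalized_bias(3244.9, 3224.5)         # SAR-derived PD
reduction = uncertainty_reduction(1287.3, 798.0)   # IQR, fixed vs SAR-derived

print(round(bias_fixed, 1), round(bias_sar, 1), round(reduction, 1))
# prints: 12.5 0.6 38.0
```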
Consequently, the use of a fixed PD in the crop model increased the number of outliers and the interquartile range in all provinces, ultimately increasing the uncertainty of yield estimation. Irrigation in the southern provinces allows two or three rice growing seasons each year, which shifted the PD of the wet season to March and April, whereas the fixed PD is June 1st (DOY 152). Thus, the M-DSSAT model overestimated rice yield in the southern provinces (Prey-Veng, Svay-Rieng, and Takeo) using a fixed PD due to the impact of late planting on estimated yield (Wang et al., 2017). Figure 2.8 shows that half of Banteay-Meanchey was planted in July in 2020, which resulted in an increase in yield estimation using the SAR-derived PD. Another factor responsible for the variation in yield estimation is the use of two different cultivars across the 40 ensembles in each province. Depending on the cultivar used for each ensemble, the PD has a different impact on yield estimation. Sen-Pidao, for example, shows a larger reduction than Phka-Rumduol when planting shifts earlier to March and April. The variation in yield was less correlated with PD for Phka-Rumduol than for Sen-Pidao.

2.6. Conclusion

In this study, an analytical time-series analysis algorithm was developed based on Sentinel-1A backscatter time-series to detect rice fields and the PD in a paddy-dominated region. The study aimed to investigate the impact of using the SAR-derived PD in a physically based crop model to estimate rice yield. The results showed a significant departure of the SAR-derived PD from the climatology-based crop calendar dates (+/-75 days). According to our findings, detecting the PD from SAR backscatter time-series in rainfed paddy fields requires more consideration than in irrigated rice fields.
The Sentinel-1A $\sigma^0_{VH}$ time-series has been shown to improve the accuracy of PD detection in rainfed paddy areas, while applying the $\sigma^0_{VH}/\sigma^0_{VV}$ ratio in the algorithm improved the accuracy of rice mapping. The estimated yields were significantly improved by replacing the fixed PD from the crop calendar with the SAR-derived PD. The average bias improved by 7-12% and 30-48%, and the uncertainty in yield estimation was reduced by 38-65% and 7-44%, in Banteay-Meanchey and Svay-Rieng, respectively. PDs retrieved in March and April come from regions that are predominantly irrigated, particularly in the southeastern provinces that border Vietnam, or they may be due to an early onset of the summer monsoon. As the mean yield in the rainfed lowland ecosystem is less than half of the irrigated rice yield, distinguishing between the two can help produce accurate estimates of rice production in the future. Using such an approach for future rice yield estimation can help improve the overall accuracy of total production estimates.

REFERENCES

Abhishek, A., Das, N.N., Ines, A.V.M., Andreadis, K.M., Jayasinghe, S., Granger, S., Ellenburg, W.L., Dutta, R., Quyen, N.H., Markert, A.M., 2021. Evaluating the impacts of drought on rice productivity over Cambodia in the Lower Mekong Basin. Journal of Hydrology 599, 126291.

Andreadis, K.M., Das, N., Stampoulis, D., Ines, A., Fisher, J.B., Granger, S., Kawata, J., Han, E., Behrangi, A., 2017. The Regional Hydrologic Extremes Assessment System: A software framework for hydrologic modeling and data assimilation. PLoS One 12, e0176506.

Baigorria, G.A., Jones, J.W., O’Brien, J.J., 2008. Potential predictability of crop yield using an ensemble climate forecast by a regional circulation model. Agricultural and Forest Meteorology 148, 1353–1361.

Boote, K.J., 1999. Concepts for calibrating crop growth models. DSSAT version 3, 179–199.

Boschetti, M., Stroppiana, D., Brivio, P.A., Bocchi, S., 2009. Multi-year monitoring of rice crop phenology through time-series analysis of MODIS images. Int J Remote Sens 30, 4643–4662.

Bouman, B.A.M., Kropff, M.J., Tuong, T.P., Wopereis, M.C.S., ten Berge, H.F.M., van Laar, H.H., 2001. ORYZA2000: Modeling Lowland Rice. International Rice Research Institute, Los Banos, Laguna.

Canisius, F., Shang, J., Liu, J., Huang, X., Ma, B., Jiao, X., Geng, X., Kovacs, J.M., Walters, D., 2018a. Tracking crop phenological development using multi-temporal polarimetric Radarsat-2 data. RSE 210, 508–518.

Copernicus Sentinel data, 2022. Retrieved from ASF DAAC, processed by ESA.

Daly, C., Halbleib, M., Smith, J.I., Gibson, W.P., Doggett, M.K., Taylor, G.H., Curtis, J., Pasteris, P.P., 2008. Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. International Journal of Climatology 28(15), 2031–2064.

European Space Agency, 2022. End of mission of the Copernicus Sentinel-1B satellite.

Fahad, S., Ihsan, M.Z., Khaliq, A., Daur, I., Saud, S., Alzamanan, S., Nasim, W., Abdullah, M., Khan, I.A., Wu, C., 2018. Consequences of high temperature under changing climate optima for rice pollen characteristics-concepts and perspectives. Archives of Agronomy and Soil Science 64(11), 1473–1488.

Food and Agriculture Organization of the United Nations, 2020. FAOSTAT Statistical Database. [WWW Document].

Friesen, J., Steele-Dunne, S.C., van de Giesen, N., 2012. Diurnal differences in global ERS scatterometer backscatter observations of the land surface. IEEE TGARS 50, 2595–2602.

General Directorate of Agriculture (GDA), 2020. Department of Agricultural Land Resources Management. Phnom Penh, Cambodia.

Guo, J., Zhou, J., Zou, Q., Liu, Y., Song, L., 2013. A novel multi-objective shuffled complex differential evolution algorithm with application to hydrological model parameter optimization. Water Resources Management 27(8), 2923–2946.
Ines, A.V.M., Das, N.N., Hansen, J.W., Njoku, E.G., 2013. Assimilation of remotely sensed soil moisture and vegetation with a crop simulation model for maize yield prediction. RSE 138, 149–164.

Inoue, Y., Sakaiya, E., Wang, C., 2014a. Potential of X-band images from high-resolution satellite SAR sensors to assess growth and yield in paddy rice. Remote Sensing 6, 5995–6019.

Irwin, S., Hubbs, T., 2019. Late Planting and Projections of 2019 US Corn and Soybean Acreage. farmdoc daily 9.

Jones, J.W., Hoogenboom, G., Porter, C.H., Boote, K.J., Batchelor, W.D., Hunt, L.A., Wilkens, P.W., Singh, U., Gijsman, A.J., Ritchie, J.T., 2003. The DSSAT cropping system model. European Journal of Agronomy 18, 235–265.

Kamoshita, A., Ikeda, H., Yamagishi, J., Lor, B., Ouk, M., 2016. Residual effects of cultivation methods on weed seed banks and weeds in Cambodia. Weed Biology and Management 16, 93–107.

Kanamitsu, M., Ebisuzaki, W., Woollen, J., Yang, S.-K., Hnilo, J.J., Fiorino, M., Potter, G.L., 2002. NCEP–DOE AMIP-II Reanalysis (R-2). Bulletin of the AMS 83(11), 1631–1644.

Khabbazan, S., Steele-Dunne, S.C., Vermunt, P., Judge, J., Vreugdenhil, M., Gao, G., 2022. The influence of surface canopy water on the relationship between L-band backscatter and biophysical variables in agricultural monitoring. RSE 268, 112789.

Kucharik, C.J., 2008. Contribution of planting date trends to increased maize yields in the central United States. Agronomy Journal 100, 328–336.

Küçük, Ç., Taşkın, G., Erten, E., 2016. Paddy-rice phenology classification based on machine-learning methods using multitemporal co-polar X-band SAR images. IEEE J-STARS 9, 2509–2519.

Lasko, K., Vadrevu, K.P., Tran, V.T., Justice, C., 2018. Mapping double and single crop paddy rice with Sentinel-1A at varying spatial scales and polarizations in Hanoi, Vietnam. IEEE J-STARS 11, 498–512.

Loo, Y.Y., Billa, L., Singh, A., 2015. Effect of climate change on seasonal monsoon in Asia and its impact on the variability of monsoon rainfall in Southeast Asia. Geoscience Frontiers 6(6), 817–823.

Mandal, D., Kumar, V., Bhattacharya, A., Rao, Y.S., Siqueira, P., Bera, S., 2018. Sen4Rice: A processing chain for differentiating early and late transplanted rice using time-series Sentinel-1 SAR data with Google Earth Engine. IEEE Geoscience and Remote Sensing Letters 15, 1947–1951.

Mandal, D., Kumar, V., Ratha, D., Dey, S., Bhattacharya, A., Lopez-Sanchez, J.M., McNairn, H., Rao, Y.S., 2020. Dual polarimetric radar vegetation index for crop growth monitoring using Sentinel-1 SAR data. RSE 247, 111954.

McNairn, H., Brisco, B., 2004. The application of C-band polarimetric SAR for agriculture: A review. Canadian Journal of Remote Sensing 30, 525–542.

McNairn, H., Jiao, X., Pacheco, A., Sinha, A., Tan, W., Li, Y., 2018. Estimating canola phenology using synthetic aperture radar. RSE 219, 196–205.

McNairn, H., Shang, J., Jiao, X., Champagne, C., 2009b. The contribution of ALOS PALSAR multipolarization and polarimetric data to crop classification. IEEE Transactions on Geoscience and Remote Sensing 47, 3981–3992.

Nadar, W., Ouk, M., Elsner, J., Brendel, T., Schubbert, R., 2020. The DNA fingerprint in food forensics part II: The Jasmine rice case. Agr. Food Ind. Technol 31, 4–8.

National Institute of Statistics (NIS), M. of P., 2019. Cambodia inter-censal agriculture survey 2019 (CIAS19) final report.

Nasirzadehdizaji, R., Balik Sanli, F., Abdikan, S., Cakir, Z., Sekertekin, A., Ustuner, M., 2019. Sensitivity analysis of multi-temporal Sentinel-1 SAR parameters to crop height and canopy coverage. Applied Sciences 9(4), 655.

Nelson, A., Setiyono, T., Rala, A.B., Quicho, E.D., Raviz, J. v, Abonete, P.J., Maunahan, A.A., Garcia, C.A., Bhatti, H.Z.M., Villano, L.S., 2014. Towards an operational SAR-based rice monitoring system in Asia: Examples from 13 demonstration sites across Asia in the RIICE project. Remote Sensing 6, 10773–10812.

Nguyen, H., Shaw, R., Prabhakar, S., 2010. Climate change adaptation and disaster risk reduction in Cambodia, in: Climate Change Adaptation and Disaster Risk Reduction: An Asian Perspective. Emerald Group Publishing Limited.

Nguyen, T.T.H., de Bie, C., Ali, A., Smaling, E.M.A., Chu, T.H., 2012. Mapping the irrigated rice cropping patterns of the Mekong delta, Vietnam, through hyper-temporal SPOT NDVI image analysis. Int J Remote Sens 33, 415–434.

Pandey, S., Byerlee, D., Dawe, D., Dobermann, A., Mohanty, S., Rozelle, S., Hardy, B., 2010. Rice in the global economy. Los Banos, Philippines: International Rice Research Institute.

Phung, H.-P., Nguyen, L.-D., Nguyen-Huy, T., Le-Toan, T., Apan, A.A., 2020a. Monitoring rice growth status in the Mekong Delta, Vietnam using multitemporal Sentinel-1 data. Journal of Applied Remote Sensing 14, 014518.

Sacks, W.J., Deryng, D., Foley, J.A., Ramankutty, N., 2010. Crop planting dates: an analysis of global patterns. Global Ecology and Biogeography 19, 607–620.

Son, N.-T., Chen, C.-F., Chen, C.-R., Toscano, P., Cheng, Y.-S., Guo, H.-Y., Syu, C.-H., 2021. A phenological object-based approach for rice crop classification using time-series Sentinel-1 Synthetic Aperture Radar (SAR) data in Taiwan. International Journal of Remote Sensing 42, 2722–2739.

Steele-Dunne, S.C., McNairn, H., Monsivais-Huertero, A., Judge, J., Liu, P.-W., Papathanassiou, K., 2017. Radar remote sensing of agricultural canopies: A review. IEEE J-STARS 10, 2249–2273.

Stöckle, C.O., Donatelli, M., Nelson, R., 2003. CropSyst, a cropping systems simulation model. European Journal of Agronomy 18, 289–307.

Tamesis, P., Conan, C., Rasmussen, K., 2020. Flood Response Plan, Cambodia.

Tsimba, R., Edmeades, G.O., Millner, J.P., Kemp, P.D., 2013. The effect of planting date on maize grain yields and yield components. Field Crops Research 150, 135–144.

Urban, D., Guan, K., Jain, M., 2018. Estimating sowing dates from satellite data over the US Midwest: a comparison of multiple sensors and metrics. RSE 211, 400–412.

U.S. Department of Agriculture, 2020. Cambodia MAFF annual report statistics. https://ipad.fas.usda.gov/rssiws/al/crop_production_maps/seasia/Cambodia_Rice.png (accessed 13 July 2022).

Van Ittersum, M.K., Leffelaar, P.A., van Keulen, H., Kropff, M.J., Bastiaans, L., Goudriaan, J., 2003. On approaches and applications of the Wageningen crop models. European Journal of Agronomy 18(3–4), 201–234.

Veloso, A., Mermoz, S., Bouvet, A., le Toan, T., Planells, M., Dejoux, J.-F., Ceschia, E., 2017. Understanding the temporal behavior of crops using Sentinel-1 and Sentinel-2-like data for agricultural applications. RSE 199, 415–426.

Wang, L.-F., Kong, J., Ding, K., le Toan, T., Ribbes, F., Floury, N., 2005. Electromagnetic scattering model for rice canopy based on Monte Carlo simulation. Progress In Electromagnetics Research 52, 153–171.

Wang, Q., Chun, J.A., Lee, W.-S., Li, S., Seng, V., 2017. Shifting planting dates and fertilizer application rates as climate change adaptation strategies for two rice cultivars in Cambodia. 한국기후변화학회지 8, 187–199.

Yang, H., Pan, B., Li, N., Wang, W., Zhang, J., Zhang, X., 2021. A systematic method for spatio-temporal phenology estimation of paddy rice using time-series Sentinel-1 images. RSE 259.

Zhang, M., Abrahao, G., Cohn, A., Campolo, J., Thompson, S., 2021. A MODIS-based scalable remote sensing method to estimate sowing and harvest dates of soybean crops in Mato Grosso, Brazil. Heliyon 7, e07436.

APPENDIX

Figure A1: Rice detection map using Sentinel-1 data in 2020 A) without, and B) with employing cross ratio in the rice mapping algorithm.
Figure A2: Flooded area in Banteay-Meanchey province during March, April and May of 2017-2020. The pixels with $\sigma^0_{VH} < -25$ dB are selected as the flooded area.

3. CHAPTER 3: YIELD ESTIMATION FROM SAR DATA USING PATCH-BASED DEEP LEARNING AND MACHINE LEARNING TECHNIQUES

3.1. Introduction

Enhancing our ability to forecast crop yields is pivotal for various socioeconomic agricultural activities, such as disaster/drought management, market strategizing, yield mapping for efficient harvest management, and insurance planning (J. Sun et al., 2019). Predicting yields is, however, prone to uncertainties due to myriad influencing factors. These factors encompass the plant's genotype, soil characteristics, weather patterns during the growing season, cultivation techniques, and vulnerability to biotic threats (Dadhwal, 2003). The complexity is further compounded by the necessity to monitor and analyze data throughout the entire crop growth cycle, requiring sustained observation and the integration of a vast array of information to make optimal predictions (Leroux et al., 2019). For example, the United States Department of Agriculture (USDA) offers monthly state-level yield projections through its Objective Yield (OY) service. For Soybean, OY surveys commence on July 25th and persist until the season's end. Yet a delay exists in accessing county-specific Soybean yield data, with the USDA only publishing these granular estimates in March of the following year. Consequently, there is an urgent need for quick and accurate county-level yield predictions. Such predictions would help farmers and sellers make better marketing decisions and manage their harvests more effectively (J. Sun et al., 2019). Beyond on-site surveys, numerous modeling techniques have evolved over time, each offering unique benefits and drawbacks for agricultural yield prediction.
Traditional approaches, such as empirical and process-based models, have found extensive application, with the Agricultural Production Systems Simulator (APSIM) and Soil Yield Model (SYM) serving as representative examples (Bolton & Friedl, 2013; Abhishek et al., 2023). Hybrid models blend elements from both methodologies, enhancing the accuracy of yield prediction (Nana et al., 2014). However, the requirement for comprehensive agro-environmental parameters can limit the applicability of complex simulation-based models predominantly to local scales. Simpler empirical models, while more straightforward to parameterize, suffer from similar limitations, often being restricted to the dataset and study site for which their relationships were initially established (Battude et al., 2016). Crop simulation models (CSMs), for example the Decision Support System for Agrotechnology Transfer (DSSAT) (Jones et al., 2003), offer insightful predictions at the field scale but often encounter challenges in broader implementation due to data availability constraints. Furthermore, the quality and precision of the input parameters significantly influence the reliability of model outcomes. For instance, the planting date is a crucial parameter for CSMs and is often derived from broad, nationally scaled crop calendars. These calendars generally propose a single planting date for the entire country over several years, which does not account for temporal or regional variations (Hashemi et al., 2022). This simplification could compromise the accuracy of CSM predictions. Similarly, the quality of other input variables such as irrigation, fertilizer, cultivar, soil, and climate data can affect model performance. In this context, the integration of remote sensing (RS) data emerges as a potent solution to alleviate these constraints (El-Hajj et al., 2016).
Specifically, a novel trend in this field involves harnessing RS data and deep learning (DL) to enable large-scale yield prediction. In recent years, the integration of optical and thermal satellite imagery with DL techniques has emerged as a powerful approach for advancing crop yield predictions. Recent research highlights the frequent use of optical data from platforms like AVHRR, Landsat-8, MODIS, UAV, and Sentinel-2. In terms of DL methodologies, CNNs, LSTM, and ConvLSTM are the frontrunners (Muruganantham et al., 2022). Notably, the 3D-CNN model has been identified as particularly effective for Soybean yield predictions when leveraging optical data, especially from MODIS (Fernandez-Beltran et al., 2021; Qiao et al., 2021; Russello, 2018; Terliksiz and Altılar, 2019; Abbaszadeh et al., 2022; Gavahi et al., 2021). J. Sun et al. (2019) showed the edge that ConvLSTM has over both CNN and LSTM, utilizing MODIS data, climate data, land surface temperature (LST), and surface reflectance data for both end-of-season and in-season Soybean yield prediction at the county level in the CONUS. This observation is consistent with the findings of Q. Yang et al. (2019) during the crop's ripening phase. Interestingly, CNNs slightly surpassed LSTM in their predictive capabilities. Reinforcing this, studies by Qiao et al. (2021) and Fernandez-Beltran et al. (2021) verified the superiority of 3D-CNNs over their 2D counterparts and LSTM when predicting yields using MODIS combined with soil and climatic information. While these advancements underscore the significant potential of optical and thermal satellite imagery combined with DL techniques, they also bring to light certain limitations inherent in these satellite data, particularly when assessing crop health indicators such as vegetation water content (VWC) and dealing with environmental interferences.
The optical and thermal indices (such as NDVI, NDWI, and NDII) prove effective under low vegetation conditions, particularly when the VWC is less than 4 kg/m² (Cosh et al., 2019b). However, these indices cannot detect water content below the canopy surface, including in the stems and ears of crops. This inability to accurately measure plant hydration levels can result in the underestimation of plant water and saturation at varying points throughout the growing season (Judge et al., 2021; Togliatti et al., 2022). Consequently, this could contribute to an underestimation of crop yield, revealing a limitation of relying solely on these indices for comprehensive crop monitoring and yield prediction. Moreover, optical and thermal sensors are adversely affected by clouds, as well as by background effects, aerosols, and saturation in high-biomass regions (Soudani et al., 2008). Therefore, multispectral optical imagery alone may not suffice in complex and diverse environments to discriminate summer crops (McNairn et al., 2014; Skakun et al., 2015).

Conversely, scientists have leveraged high-resolution (approximately 10-meter) multi-temporal Synthetic Aperture Radar (SAR) images/observations for crop classification and monitoring since 1990, using time series analysis methods and traditional machine learning (ML) techniques (Hashemi et al., 2023; Yang et al., 2021b). These SAR images offer significant advantages due to their all-weather and day-night monitoring capabilities. SAR sensors exhibit a keen sensitivity to a multitude of vegetation attributes. These include dielectric properties, size, shape, orientation, roughness, and the distribution of various plant components like leaves, stems, and fruits (McDonald et al., 2000; Steele-Dunne et al., 2017).
Such nuanced detection capabilities empower SAR sensors to capture different growth stages and structural variations of crops fairly accurately (Thorp and Drajat, 2021; Zhao et al., 2022), which is helpful in estimating crop yield. Highlighting SAR's ability in yield prediction, Clauss et al. (2018) used Sentinel-1 SAR time series data to accurately estimate rice production in multiple locations, including China, California, and Spain. Their approach, which employed super-pixel segmentation and a phenology-based decision tree, demonstrated a strong correlation with district-level data from province statistics offices, showcasing the potential of SAR time series data to estimate rice production. Alebele et al. (2021) showed that the synergetic use of Sentinel-1 and -2 can be an effective approach for crop yield estimation using Gaussian kernel regression, while Sharma et al. (2022) showed that the VH-polarization-based artificial neural network (ANN) model outperformed the VV-polarization-based model in paddy rice yield estimation, with an R² of 0.72 and an RMSE of 600.1 kg/ha. den Besten et al. (2023) aimed to improve our understanding of the relationship between Sentinel-1 backscatter and variations in sugarcane yield as well as waterlogging effects. The analysis was carried out on an irrigated sugarcane plantation, utilizing a substantial dataset of sugarcane yield. By examining different seasons, the study investigated the correlation between backscatter and sucrose yield variability. Notably, the findings indicated a connection between VV backscatter and stalk development, which serves as a critical reservoir for sucrose accumulation. Despite SAR's potential in yield estimation, its integration with DL remains a largely untapped but promising area for crop monitoring applications and yield prediction.
The reliance of DL on extensive training datasets poses a significant challenge in crop monitoring (e.g., crop phenology and biophysical parameters (BPs) estimation) and yield estimation, as obtaining accurate and comprehensive ground-reference data for such applications is both resource-intensive and costly. Crucially, SAR data, while rich in information, can often be noisy and difficult to interpret, especially in agricultural applications where a wide range of factors can influence the signals received. Conversely, the multi-layered structure and sophisticated learning capabilities of DL can be particularly beneficial in this context. By utilizing a deep, hierarchical learning approach, DL can efficiently extract and interpret salient features from SAR observations, effectively managing and deciphering the intrinsic noise present in SAR data (Kamilaris and Prenafeta-Boldú, 2018). Setting aside factors like planting date, cultivar, soil type, irrigation, and fertilizer usage, our approach in this study uses SAR backscatter and climate data with patch-based 3D-CNNs as the DL method and Random Forest (RF), XGBoost, and Support Vector Machine (SVM) as ML methods to accurately predict both in-season and end-of-season yields for Corn, Soybean, and Winter Wheat. Our approach, as the first known effort to predict field-scale crop yield using SAR and DL, was demonstrated in a Michigan case study area lacking irrigation. It includes eight years (2016-2023) of yield data, spanning diverse weather conditions, and aims to create a model applicable to various climate scenarios. This paper will address these pivotal questions:

1. How effective are DL and ML models in estimating crop yields from SAR imagery?
2. Which SAR features have the most significant impact on yield estimation using ML and DL techniques?
3. What is the earliest possible point in the growth cycle for accurate yield estimation using SAR imagery and DL and ML techniques?

3.2. Study Area and Material

3.2.1. Case Study

This study focuses on the W.K. Kellogg Biological Station (KBS) in Michigan as its primary case study. As an integral part of the U.S. Long-Term Agroecosystem Research (LTAR) Network, established by the USDA, KBS is committed to formulating sustainable intensification strategies for agricultural production. This network is a synergy of 18 extensive research sites across the U.S., all collaborating towards a common goal. Our study centers on three pivotal crops, Corn, Soybean, and Winter Wheat, all grown within the confines of the KBS. The selected fields, which span from 1.3 to 17.5 hectares in size, are depicted in Figure 3.1, showcasing the field distribution in the area. Notably, most of these fields undergo annual crop rotation to enhance crop yield rates. The growth cycle varies by crop, with Corn and Soybean cultivation extending from May to October and Winter Wheat from September to July. Unique to these fields is their lack of irrigation systems; they rely solely on rainfall for crop growth.

Figure 3.1: Location of Corn, Soybean, and Winter Wheat fields within Michigan's KBS site. The fields, depicted by the orange shapes on the map, undergo annual crop rotation.

3.2.2. Dataset and input features to the DL and ML models

a. SAR data

The ascending Radiometrically Terrain Corrected (RTC) Sentinel-1A images, spanning eight years (2016-2023) from May to October, were downloaded from the Alaska Satellite Facility (ASF). Subsequently, these images underwent speckle filtering. The SAR images were originally available at a 30-meter resolution and were aggregated to a 50-meter resolution through the bilinear sampling method to mitigate the effects of speckle and noise. Both resolutions are explored for yield prediction, with the aim of determining the optimal Sentinel-1 observation resolution for such prediction and investigating the capability of DL in mitigating the noise effect.
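The 30-meter to 50-meter resampling step can be sketched as follows. This is a minimal NumPy illustration of bilinear sampling, not the processing chain used in this study: the function names are hypothetical, and interpolating in linear power units before converting back to decibels is an assumption, since the text does not state the unit in which the resampling was performed.

```python
import numpy as np


def bilinear_resample(img, src_res=30.0, dst_res=50.0):
    """Resample a 2-D array from src_res to dst_res pixel spacing (metres)."""
    img = np.asarray(img, dtype=float)
    n_rows, n_cols = img.shape
    out_rows = int(n_rows * src_res // dst_res)
    out_cols = int(n_cols * src_res // dst_res)
    # Centres of the output pixels, expressed as fractional source indices.
    r = np.clip((np.arange(out_rows) + 0.5) * dst_res / src_res - 0.5, 0, n_rows - 1)
    c = np.clip((np.arange(out_cols) + 0.5) * dst_res / src_res - 0.5, 0, n_cols - 1)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, n_rows - 1)
    c1 = np.minimum(c0 + 1, n_cols - 1)
    wr = (r - r0)[:, None]  # row weights, shape (out_rows, 1)
    wc = (c - c0)[None, :]  # column weights, shape (1, out_cols)
    # Interpolate along columns first, then along rows.
    top = img[np.ix_(r0, c0)] * (1 - wc) + img[np.ix_(r0, c1)] * wc
    bottom = img[np.ix_(r1, c0)] * (1 - wc) + img[np.ix_(r1, c1)] * wc
    return top * (1 - wr) + bottom * wr


def resample_db(img_db, src_res=30.0, dst_res=50.0):
    """Resample backscatter given in dB, interpolating in linear power (assumed)."""
    linear = 10.0 ** (np.asarray(img_db, dtype=float) / 10.0)
    return 10.0 * np.log10(bilinear_resample(linear, src_res, dst_res))


# A 10 x 10 tile at 30 m (300 m x 300 m) becomes a 6 x 6 tile at 50 m.
tile_db = np.full((10, 10), -20.0)
print(resample_db(tile_db).shape)
```

In practice a geospatial library would also update the raster's georeferencing; the sketch only covers the array arithmetic.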
To ensure that our analysis is not influenced by varying planting dates, which typically fall between mid-May and early June for Corn and Soybean and between the end of September and early November for Winter Wheat, we selected SAR data from June 1st to the end of October for Corn and Soybean, and from May 1st to the end of July for Winter Wheat. This allows a more consistent assessment of the growth cycles of these crops. Feature inputs to the DL and ML models comprised the VH and VV polarization channels (in decibels), along with the polarization ratio, $\sigma^0_{VH}/\sigma^0_{VV}$; the Radar Vegetation Index (RVI), $4\sigma^0_{VH}/(\sigma^0_{VH}+\sigma^0_{VV})$ (Nasirzadehdizaji et al., 2019); and the Radar Cross-Section Polarization Ratio, $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$ (Nasirzadehdizaji et al., 2019), with these ratio-based features measured in linear units. In addition, the incidence angle was deliberately included as an input feature, removing the need for a separate incidence angle correction. Because all field sites lie within a single Sentinel-1 tile with exact repeat, the incidence angle does not vary in this study; it was nevertheless included to make the model adaptable to diverse locations in future applications.

b. Climate data

Khaki and Wang (2019) identified precipitation and minimum and maximum temperature as the pivotal climate factors influencing yield prediction. Using the KBS-LTER weather station (https://lter.kbs.msu.edu/datatables/7), our study focused exclusively on daily precipitation and daily minimum and maximum temperature, the finest-resolution climate data available for our study. Accumulated precipitation and average minimum and maximum temperature were computed up to each satellite revisit time.
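As an illustration of the feature definitions above, the following is a minimal sketch, not the processing chain used in this study. It assumes backscatter is supplied in decibels and converted to linear power before the ratio-type features are formed, and that climate values are accumulated from the first available daily record. The function names, the record layout, the units (mm, °C), and the sample values are all hypothetical.

```python
from datetime import date


def db_to_linear(value_db):
    """Convert backscatter from decibels to linear power units."""
    return 10.0 ** (value_db / 10.0)


def sar_features(vh_db, vv_db):
    """Derive the ratio-type SAR features for one pixel from VH and VV in dB."""
    vh = db_to_linear(vh_db)
    vv = db_to_linear(vv_db)
    return {
        "vh_db": vh_db,
        "vv_db": vv_db,
        "ratio": vh / vv,                # sigma0_VH / sigma0_VV
        "rvi": 4.0 * vh / (vh + vv),     # Radar Vegetation Index
        "rcspr": (vv - vh) / (vv + vh),  # (VV - VH) / (VV + VH)
    }


def climate_up_to(daily_records, revisit_day):
    """Accumulate precipitation and average Tmin/Tmax up to a revisit date.

    daily_records: list of (day, precip, tmin, tmax) tuples (assumed layout).
    The accumulation starts at the first record, which is an assumption.
    """
    rows = [rec for rec in daily_records if rec[0] <= revisit_day]
    if not rows:
        raise ValueError("no climate records on or before the revisit date")
    n = len(rows)
    return {
        "precip_sum": sum(rec[1] for rec in rows),
        "tmin_mean": sum(rec[2] for rec in rows) / n,
        "tmax_mean": sum(rec[3] for rec in rows) / n,
    }


# Hypothetical sample values, for illustration only.
features = sar_features(vh_db=-18.0, vv_db=-11.0)
weather = [
    (date(2022, 6, 1), 0.0, 12.0, 24.0),
    (date(2022, 6, 2), 5.0, 14.0, 26.0),
    (date(2022, 6, 3), 2.5, 13.0, 22.0),
]
climate = climate_up_to(weather, date(2022, 6, 2))
```

Algebraically, $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH}) = 1 - \mathrm{RVI}/2$, so this index and the RVI are exact linear transforms of each other with opposite sign.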
Finally, the training dataset for the DL and ML models combines nine features: six SAR-based features ($\sigma^0_{VH}$, $\sigma^0_{VV}$, $\sigma^0_{VH}/\sigma^0_{VV}$, RVI, $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$, and incidence angle) and three climate-based features (precipitation, minimum temperature, and maximum temperature), with a temporal depth of 8-14 (the number of Sentinel-1 images).

d. Yield data

For yield data, we used the yield maps produced by John Deere software, obtained from KBS. The John Deere Operations Center is a cloud-based farm management platform provided by the agricultural machinery manufacturer John Deere. Its yield mapping function uses data from harvesting machinery to generate a visual representation of yield variation across fields (Luck and Fulton, 2014). We used the real-time grain yield estimates, in bushels per acre (bu/acre), recorded at each GPS-tracked location in the field. This dataset, spanning 2016 to 2023, provides point measurements of both moisture and dry yield for Corn, Soybean, and Winter Wheat, stored in shapefile format. With points captured less than 5 meters apart, the dataset has a high spatial resolution. To align the yield data with the Sentinel-1 training data at 30- and 50-meter resolution, we generated two yield TIFF files by averaging the yield values of the points within each pixel. Figure 3.2 shows the point measurements exported from the software alongside the 30- and 50-meter TIFF files.

Figure 3.2: Spatial distribution of yield in a 2022 KBS Corn field: a) point-based measurements, b) high-resolution yield map at 30-meter resolution (bu/acre), and c) coarser yield map at 50-meter resolution (bu/acre).

3.3.
Patch-Based Regression Methodology for Yield Estimation

In this study, we used a 3D-CNNs DL architecture, along with three ML techniques, Random Forest (RF), Support Vector Machine (SVM), and XGBoost, to estimate crop yield from the spatiotemporal variations observed in SAR data. Below, we give a concise overview of the patch-based 3D-CNNs architecture and its hyperparameters, as well as those of the ML techniques. The models were implemented in Python using library functions provided by the PyTorch framework.

3.3.1. 3D Convolutional Neural Network (3D-CNNs)

Convolutional Neural Networks (CNNs) are known for their ability to perform classification and regression tasks with high accuracy due to their hierarchical structure and large learning capacity (Oquab et al., 2014). They accept many forms of input data, including images, speech, natural language, audio, and video (Kamilaris and Prenafeta-Boldú, 2018). Convolutional layers, activation functions, pooling layers, and fully connected layers are the main components of CNNs. CNNs can be applied in 1D across the spectral or temporal dimension, in 2D across the spatial dimensions, or in 3D across the spectral and spatial dimensions. Although 3D convolution has shown slightly higher accuracy in crop classification (Kussul et al., 2017), CNNs are rarely used as feature extractors for the temporal domain of remotely sensed time series (Zhong et al., 2019b). Cué La Rosa et al. (2019) showed the superiority of patch-based 3D-CNNs over 3D-FCN and RF for crop classification using Sentinel-1. Fontanelli et al. (2022) and Teimouri et al. (2022) also demonstrated that 3D-CNNs outperform 1D- and 2D-CNNs in crop classification. To our knowledge, this study is the first to evaluate patch-based 3D-CNNs for yield prediction using Sentinel-1.
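To make the idea of a 3D convolution over an image time series concrete, the sketch below implements a naive single-channel 3D convolution in plain Python, with the depth axis taken to be time (the stack of Sentinel-1 acquisitions) and zero padding so that the output keeps the input shape. It only illustrates the operation; it is not the PyTorch model used in this study, and the function name and sample values are hypothetical.

```python
def conv3d_same(volume, kernel):
    """Naive single-channel 3D convolution (cross-correlation), zero padded.

    volume: nested list indexed [t][y][x] (time, height, width).
    kernel: nested list with an odd size along each axis.
    Returns a nested list with the same shape as `volume`.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    pt, ph, pw = kt // 2, kh // 2, kw // 2  # padding that preserves the shape
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for y in range(H):
            for x in range(W):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            tt, yy, xx = t + dt - pt, y + dy - ph, x + dx - pw
                            # Positions outside the patch act as zeros.
                            if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                                acc += kernel[dt][dy][dx] * volume[tt][yy][xx]
                out[t][y][x] = acc
    return out


# Hypothetical 3-date, 2x2-pixel patch and a purely temporal averaging kernel.
patch = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 3.0], [4.0, 5.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
temporal_mean = [[[1.0 / 3.0]], [[1.0 / 3.0]], [[1.0 / 3.0]]]
smoothed = conv3d_same(patch, temporal_mean)
```

A kernel of shape (3, 1, 1), as here, mixes information only along time; a kernel of shape (3, 3, 3) would combine neighbouring dates and neighbouring pixels in one step, which is what distinguishes 3D from 1D or 2D convolution.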
For each field, the SAR, climate, and yield data were extracted as data and label patches for input to the model. Because our yield data are field-specific and lack spatial connectivity between fields, we employed a patch-based regression methodology, which processes each patch in isolation and enables a more granular and targeted analysis. Each image patch was normalized independently using the z-score method, in which the mean is subtracted from each data point and the result is divided by the standard deviation. This normalization stabilizes training by giving the data a mean of 0 and a standard deviation of 1. Figure 3.3 illustrates the architecture of the patch-based 3D-CNNs, and Table 3.1 presents the model's hyperparameters, which were optimized for performance on the validation set during training.

Figure 3.3: Patch-based 3D-CNNs architecture for yield prediction using Sentinel-1 time series.

Given the variability in patch sizes, the depth (time steps), width, and height dimensions of the input remain unchanged through the layers, while the channel dimension progressively increases in the encoder and decreases in the decoder. In this DL model, the batch size is the number of patches processed before the model weights are updated. Early stopping with a patience of 150 halted training before the model could overfit the training data. This safeguards against overfitting, helps the model generalize to unseen data, and reduces computational cost.

Table 3.1: Detailed Breakdown of Hyperparameters Employed in the 3D-CNNs Model.
Parameter              Value
lr                     0.01
Lr scheduler           Cosine Annealing
T_max (scheduler)      1000
Eta_min (scheduler)    0.0001
Optimizer              Adam
Loss function          L1 Loss
Batch size             32
Weight decay           10^-6

To train and evaluate the model, we partitioned our dataset into training, validation, and test subsets. The training subset is used to fit the model's parameters, the validation subset is used for hyperparameter optimization and for mitigating overfitting through early stopping, and the test subset serves as a benchmark of the model's performance on unseen data. Tables 3.2 and 3.3 give the number of patches and pixels in the training, validation, and test datasets for the two SAR resolutions. To keep the model from memorizing the training data and to strengthen generalization, we shuffled the training dataset at the start of each epoch, which prevents unintended patterns arising from the sequential ordering of samples. We used vertical and horizontal flipping for augmentation. Additionally, we employed transfer learning, initializing the weights from a 3D-CNNs crop classification model trained with Sentinel-1 features as inputs and Cropland Data Layer (CDL)-USDA-NASS (Boryan et al., 2011) crop-type data as labels. In terms of model configuration, since yield prediction is a regression problem, we used Mean Squared Error (MSE) as the loss function, and the output layer has no activation function. We also used K-Fold Cross-Validation to obtain a more robust evaluation of the model by training and validating it on 5 different subsets of the data.

Table 3.2: Sample set distribution in the study area, 30-meter resolution (without augmentation).
Category                     Corn    Soybean    Winter Wheat    All
Number of patches/fields
  all                        47      65         50              162
  train                      36      49         36              126
  validation                 6       6          6               14
  test                       5       8          8               22
Total pixels                 3955    3194       3777            10926

Table 3.3: Sample set distribution in the study area, 50-meter resolution (without augmentation).

Category                     Corn    Soybean    Winter Wheat    All
Number of patches/fields
  all                        37      41         39              117
  train                      28      30         30              86
  validation                 4       4          4               12
  test                       5       7          5               19
Total pixels                 1383    1326       924             3633

3.3.2. Machine Learning techniques

Random Forest (RF): RF is an ensemble learning method that constructs multiple decision trees during training and, for unseen data, outputs the mode of the classes (for classification) or the mean prediction (for regression) of the individual trees (Breiman, 2001). In yield prediction, it can capture complex nonlinear relationships in the data by aggregating the predictions of numerous trees, enhancing accuracy and robustness (Dang et al., 2021). Hyperparameter tuning of the RF model was conducted on the validation dataset to optimize predictive performance. The final hyperparameters were as follows: the number of trees was set to 400, the maximum tree depth to 40, the minimum number of samples required to split an internal node to 2, and the minimum number of samples required at a leaf node to 1, with bootstrap sampling enabled.

Support Vector Machine (SVM): SVM is a supervised learning algorithm that finds the hyperplane that best separates a dataset into classes. For yield prediction, SVM can be used in its regression form, known as Support Vector Regression (SVR) (Dang et al., 2021). SVR fits a function such that as many data points as possible lie within a predefined margin around it, penalizing only deviations larger than that margin, which makes it suitable for predicting continuous values such as yield. Similarly, the SVR model was fine-tuned to achieve optimal predictive accuracy.
The hyperparameters after tuning were as follows: the cost parameter was set to 10, the kernel coefficient to 0.01, and a linear kernel function was used.

eXtreme Gradient Boosting (XGBoost): XGBoost is an optimized gradient boosting library that builds decision trees sequentially, with each tree correcting the errors of its predecessors (Chen and Guestrin, 2016). In the context of yield prediction, XGBoost can handle missing values and capture nonlinear relationships, and it is known for its high performance and speed, making it a popular choice for many predictive tasks. The XGBoost model was likewise tuned to enhance its predictive performance. The final hyperparameters were as follows: the number of boosting rounds was set to 400, the learning rate to 0.1, the maximum tree depth to 8, the fraction of columns randomly sampled for each tree to 0.7, and the fraction of samples used to train each tree to 0.9.

For the ML methods, explicit feature engineering was not conducted. The SAR-derived features can themselves be regarded as a form of feature engineering: they include the polarimetric variables $\sigma^0_{VH}$ and $\sigma^0_{VV}$ along with derived ratios and indices such as $\sigma^0_{VH}/\sigma^0_{VV}$, $4\sigma^0_{VH}/(\sigma^0_{VH}+\sigma^0_{VV})$, and $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$, which are formed by combining these polarimetric variables. These features already encapsulate significant information and variability, reducing the need for additional feature engineering.

3.3.3.
Model Assessment Techniques

To assess the accuracy of the predicted yield against the actual labels, several statistical metrics suited to regression problems were selected: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Normalized RMSE (NRMSE), correlation coefficient (r), and Index of Agreement (d).

$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|P_i - O_i\right| \tag{1}$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_i - O_i\right)^2} \tag{2}$$

$$NRMSE = \frac{RMSE}{\bar{O}} \tag{3}$$

$$r = \frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{P_i - \bar{P}}{S_P}\right)\left(\frac{O_i - \bar{O}}{S_O}\right) \tag{4}$$

$$d = 1 - \frac{\sum_{i=1}^{N}\left(P_i - O_i\right)^2}{\sum_{i=1}^{N}\left(\left|P_i - \bar{O}\right| + \left|O_i - \bar{O}\right|\right)^2} \tag{5}$$

$$MAPE = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{P_i - O_i}{O_i}\right| \tag{6}$$

where $P_i$ and $O_i$ denote the predicted and observed values for the $i$th instance, respectively, $\bar{P}$ and $\bar{O}$ are the means of the predicted and observed values, $S_P$ and $S_O$ are the standard deviations of the predicted and observed values, and $N$ is the sample size. The correlation coefficient, r, evaluates how well the predictions follow a linear relationship with the actual values, focusing on variance in the samples rather than model bias. Both RMSE and NRMSE quantify the average discrepancy between predictions and observations, encompassing both bias and random error. The index of agreement, d, assesses the accuracy of model predictions, accounting for both systematic and random errors; it is scaled by the variability of the predictions and observations around the observed mean. Notably, these metrics are computed per patch, against each patch's yield, which is obtained by averaging the pixel-level yield values.

3.4. Experiments and Results

To enhance the yield prediction capabilities of the ML and DL models, we conducted an in-depth analysis of the Sentinel-1-derived features detailed in Section 3.2.2.a. Our objective was to discern whether variations in yield are accompanied by distinct time series and signals in Sentinel-1 data.
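For reference, the assessment metrics of Section 3.3.3 (Equations 1 to 6) can be sketched in plain Python as follows. This is an illustrative implementation rather than the evaluation code used in this study; it assumes sample standard deviations (N - 1 denominator) for $S_P$ and $S_O$, consistent with the $1/(N-1)$ factor in Equation 4, and the function name and sample values are hypothetical.

```python
import math


def regression_metrics(pred, obs):
    """Compute MAE, RMSE, NRMSE, r, d, and MAPE for paired predictions/observations."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    pairs = list(zip(pred, obs))

    mae = sum(abs(p - o) for p, o in pairs) / n                    # Eq. 1
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in pairs) / n)      # Eq. 2
    nrmse = rmse / mean_o                                          # Eq. 3

    # Sample standard deviations (assumed N - 1 denominator).
    s_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred) / (n - 1))
    s_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / (n - 1))
    r = sum(((p - mean_p) / s_p) * ((o - mean_o) / s_o)
            for p, o in pairs) / (n - 1)                           # Eq. 4

    potential_error = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                          for p, o in pairs)
    d = 1.0 - sum((p - o) ** 2 for p, o in pairs) / potential_error  # Eq. 5

    mape = 100.0 / n * sum(abs((p - o) / o) for p, o in pairs)     # Eq. 6
    return {"mae": mae, "rmse": rmse, "nrmse": nrmse,
            "r": r, "d": d, "mape": mape}


# Hypothetical patch-level yields (kg/ha), for illustration only.
metrics = regression_metrics(pred=[5200.0, 3100.0, 9800.0],
                             obs=[5000.0, 3300.0, 10400.0])
```

Note that MAPE is undefined when any observed value is zero, and r is undefined when either series is constant, so a production version would need to guard those cases.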
Figure 3.4 depicts the mean temporal progression of five Sentinel-1-derived features throughout the growth cycle of Soybean, Corn, and Winter Wheat. A key insight from Figure 3.4 is that the $\sigma^0_{VH}$ channel is more strongly correlated with fluctuations in crop yield than the $\sigma^0_{VV}$ channel. The $\sigma^0_{VH}/\sigma^0_{VV}$ and RVI signals were nearly indistinguishable, following each other closely. The $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$ signal mirrors those of $\sigma^0_{VH}/\sigma^0_{VV}$ and RVI inversely, and all three show a consistent relationship with the variations present in the yield map.

Building on these insights, Figures 3.5 and 3.6 explore the $\sigma^0_{VH}$ and RVI time series across multiple pixels within the three sample fields depicted in Figure 3.4, at 30- and 50-meter resolution, respectively, providing a comparative perspective on the crop growth cycle. Both figures show that, regardless of resolution, the $\sigma^0_{VH}$ and RVI time series are clearly related to the planting and harvest dates of Soybean and Corn, with backscatter increasing after seeding and decreasing after harvest. For Winter Wheat, which is seeded months earlier and harvested in July, the $\sigma^0_{VH}$ and RVI features do not exhibit a clear relationship. Nonetheless, the primary focus of this paper is not to delineate these apparent relationships but to examine the nonlinear relationship between Sentinel-1 features and yield variations.

Figure 3.4: Average 30-meter Sentinel-1 signal progression across five derived features during the Soybean, Corn, and Winter Wheat growth cycles.
[Figure 3.4 panels (Soybean, Corn, Winter Wheat): $\sigma^0_{VH}$ [dB], $\sigma^0_{VV}$ [dB], $\sigma^0_{VH}/\sigma^0_{VV}$ [-], RVI [-], $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$ [-], and yield (kg/900 m²).]

Figure 3.5: Comparative analysis of $\sigma^0_{VH}$ and RVI time series at 30-meter resolution for A) Soybean-2020, B) Corn-2019, and C) Winter Wheat-2023 sample fields.

3.4.1. Ablation Study

Given the limited size of the yield dataset in this study, we derived the dry weight of the yield irrespective of crop type and analyzed all three crops collectively. Because the SAR time series display distinct patterns for each crop, our focus was to examine how well the DL and ML models capture the nonlinear relationships between yield values and SAR features across these crops. Expanding the dataset and tailoring the models to individual crops would be expected to improve accuracy and reduce bias and error in the predictions, but this requires a larger sample size.

Figure 3.6: Comparative analysis of $\sigma^0_{VH}$ and RVI time series at 50-meter resolution for A) Soybean-2020, B) Corn-2019, and C) Winter Wheat-2023 sample fields.

Our methodology involved running the 3D-CNNs and the three ML models on seven distinct feature combinations to identify the most predictive set for yield forecasting. Climate features (precipitation, minimum and maximum temperature) were included in all combinations, given their established influence on crop yield (Khaki and Wang, 2019). The feature sets ranged from four features ($\sigma^0_{VH}$ + climate data) to nine features ($\sigma^0_{VH}$ + $\sigma^0_{VV}$ + $\sigma^0_{VH}/\sigma^0_{VV}$ + RVI + $(\sigma^0_{VV}-\sigma^0_{VH})/(\sigma^0_{VV}+\sigma^0_{VH})$ + incidence angle + climate), incrementally adding SAR-derived features. Figure 3.7 illustrates the evolution of MAE and r values in response to increasing feature counts.
The 30-meter Sentinel-1 resolution was used for these analyses. Among the ML models, XGBoost performed best and was therefore selected for detailed analysis alongside the 3D-CNNs. XGBoost displayed a consistent decline in MAE as features were added, achieving its best performance with the full feature set. In contrast, the 3D-CNNs model performed best with only four features, underscoring the predominant influence of $\sigma^0_{VH}$. These findings highlight the role of feature engineering in ML models such as XGBoost. Unlike DL models such as 3D-CNNs, which inherently extract and exploit complex data relationships, traditional ML models rely heavily on the deliberate crafting and refinement of input features. This reliance makes thorough feature engineering critical, particularly for models lacking intrinsic feature prioritization or extraction capabilities. Furthermore, the lower MAE achieved by the 3D-CNNs using only four features, without the incidence angle, supports the assertion by Garnot et al. (2022) and Gargiulo et al. (2020) that DL algorithms can bypass the need for extensive preprocessing of SAR data.

Figure 3.7: Comparative performance analysis of 3D-CNNs and XGBoost architectures highlighting the impact of feature combinations on yield prediction accuracy.

After identifying the optimal feature combination for each DL and ML model, we analyzed the influence of noise in pixels located on field edges. Figure 3.2 shows the yield map pixels; pixels along the field margins may include other crop fields, bare soil, or trees, which can contribute noise. This noise is particularly pronounced at coarser spatial resolutions, such as 50 meters.
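One simple way to identify edge pixels such as those discussed above is to keep only pixels whose entire 3x3 neighbourhood lies inside the field. The sketch below shows this idea on a boolean field mask. The source does not specify how edge pixels were identified, so the neighbourhood rule, the function name, and the sample mask are assumptions for illustration only.

```python
def interior_mask(field_mask):
    """Flag pixels whose 8 neighbours all belong to the field.

    field_mask: list of rows of booleans (True = pixel inside the field).
    Pixels on the raster border are treated as edge pixels (assumption).
    """
    rows, cols = len(field_mask), len(field_mask[0])
    interior = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # The 3x3 window includes the pixel itself and its 8 neighbours.
            interior[r][c] = all(
                field_mask[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
    return interior


# Hypothetical 6x6 raster containing a 4x4-pixel field.
field = [[1 <= r <= 4 and 1 <= c <= 4 for c in range(6)] for r in range(6)]
core = interior_mask(field)
kept_pixels = sum(cell for row in core for cell in row)
```

With this rule a field must be at least three pixels wide in both directions to retain any interior pixels, so small fields lose proportionally more data, and more so at 50-meter than at 30-meter resolution.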
Figure 3.8 depicts the yield discrepancy map, showing the absolute differences between the estimated yields and the actual labeled yields. The comparison is conducted at 30-meter resolution using the optimal feature set for the 3D-CNNs and XGBoost models, and it includes every pixel in the fields. For this analysis, we randomly selected one sample from each crop and each year as the test dataset (22 samples). Figure 3.9 displays the yield difference map for the same fields as Figure 3.8, also at 30-meter resolution, but after mitigating the influence of edge pixels during model training, validation, and testing. Comparing the two figures shows that the greatest discrepancies between estimated and actual yields occur predominantly at the field edges, particularly for Winter Wheat and Soybean. Hence, the subsequent sections present the final results of the DL and ML models obtained with this data cleaning approach, which excludes edge pixels and uses the optimal feature combination specific to each model.

Figure 3.8: Yield prediction discrepancy map at 30-meter resolution across three crops and eight growing seasons, depicting differences without edge filtering.

Figure 3.9: Yield prediction discrepancy map at 30-meter resolution for three crops over eight growing seasons, with edge pixel removal implemented.

3.4.2. End-of-Season Yield Estimation

In this study, we employed k-fold cross-validation to estimate crop yields, thereby enhancing the reliability and generalizability of our models. To ensure the comprehensiveness and randomness of our validation process, we selected five distinct sets of test and validation datasets.
Each test dataset comprised 22 samples: 8 Winter Wheat fields, 8 Soybean fields, and 6 Corn fields. This diversity allowed us to assess and validate the models' performance across crop types and climate conditions. Figures 3.10 and 3.11 give a detailed breakdown of six statistical parameters, evaluated collectively for all crops and separately for each crop, at 30-meter and 50-meter resolutions. The complete tables detailing these results are included in the supplementary materials accompanying this paper (Tables S1 and S2).

Figure 3.10: Comparative performance of various DL and ML architectures at two resolutions: a) 30-meter and b) 50-meter for all crops, evaluated across six statistical parameters. Within this illustration, the units of MAE and RMSE are expressed in kg/ha, while MAPE is presented as a percentage, and R, D, and NRMSE are dimensionless.

The findings presented in Figure 3.10 and Tables S1 and S2 compare the performance of three ML methods (RF, XGBoost, and SVM) and the 3D-CNNs model in estimating crop yield from Sentinel-1 SAR data at two resolutions: 30-meter and 50-meter. For all crops combined at the 30-meter resolution, XGBoost was the most accurate model, with the lowest MAE (590.8 kg/ha), RMSE (874.4 kg/ha), and MAPE (9.5%), alongside the highest r (0.96) and d index (0.98), indicating robust predictive accuracy and strong agreement with observed data. Conversely, SVM performed the poorest, with higher MAE (1508.1 kg/ha), RMSE (2114.6 kg/ha), and MAPE (24.8%), and lower r (0.74) and d index (0.85).
The 3D-CNNs model, while not outperforming XGBoost, still showed commendable results, with a high d index (0.95) and r (0.92) for all crops, suggesting good agreement with actual observations, though it had a higher MAE (872 kg/ha), RMSE (1215.94 kg/ha), and MAPE (14.7%) than XGBoost. When examining individual crops at the same resolution (Figure 3.11), XGBoost showed superior performance across all metrics for Winter Wheat (MAE of 322.2 kg/ha, RMSE of 362.6 kg/ha, MAPE of 7.9%, and r of 0.72) and Soybean (MAE of 340.3 kg/ha, RMSE of 419.6 kg/ha, MAPE of 8.6%, and r of 0.66), while for Corn it achieved a higher correlation (r of 0.90) despite a slightly higher MAE (1237.3 kg/ha) and RMSE (1504.2 kg/ha) compared to 3D-CNNs. We refer readers to Tables S1 and S2 for more detailed information. Collectively, these findings highlight the robustness of the XGBoost algorithm in predicting yields across diverse crops, with notably superior performance in estimating Winter Wheat yield. Corn, which differs structurally from the other crops, presents a unique challenge that ideally requires separate model training. However, due to the limited dataset, particularly for the data-intensive 3D-CNNs, the models were trained on a combined crop dataset.

At the 50-meter resolution, a marked degradation in performance was observed across all models. For all crops, the 3D-CNNs MAE and RMSE increased to 1282.5 kg/ha and 1881.1 kg/ha, respectively, r decreased to 0.83, and the d index fell to 0.90. Similarly, for XGBoost, MAE and RMSE rose to 1509.2 kg/ha and 1968.5 kg/ha, respectively, r fell to 0.80, and the d index declined to 0.86. The MAPEs of both models also increased significantly.
This decline was particularly pronounced for Winter Wheat predictions using XGBoost, where MAE rose to 1081.4 kg/ha, RMSE rose to 1328.3 kg/ha, and r became negative (-0.66), indicating a moderate negative correlation between predicted and actual yield. Concurrently, the d index dropped to 0.26 and MAPE rose to 21.0%, signaling a substantial loss of both prediction accuracy and agreement with observed data. The smaller sample size at the 50-meter resolution likely provided less information for training, hindering the models' ability to learn and make accurate predictions. This is particularly impactful for DL models like 3D-CNNs, which require substantial data to learn intricate patterns. Furthermore, the coarser resolution integrates more information from neighboring fields, introducing additional noise into the data. This noise can significantly distort the true backscatter and other signals from the crops of interest, confounding the models and leading to higher error rates. However, the results at the 50-meter resolution reveal a notable advantage of 3D-CNNs over XGBoost when the data are noisy. As illustrated in Figures 3.10 and 3.11, 3D-CNNs outperforms XGBoost, not only in the aggregate analysis of all crops but also for each crop individually. This suggests that 3D-CNNs has a higher tolerance for noise in the data, which improves its predictive reliability under these conditions.

Figure 3.11: Comparative Performance of Various DL and ML Architectures at Two Resolutions: a) 30-meter and b) 50-meter for Corn, Evaluated Across Six Statistical Parameters. Within this illustration, the units of MAE and RMSE are expressed in kg/ha, while MAPE is presented as a percentage, and R, D, and NRMSE are dimensionless.
Compared to findings from similar studies using 3D-CNNs and MODIS data, our research highlights the capability of SAR data to enhance yield prediction accuracy. For instance, Qiao et al. (2021) used a 3D-CNNs model, enhanced with a multikernel learning approach and supplemented by MODIS data, and achieved a MAPE of 14.35% in Winter Wheat yield prediction. In contrast, our models achieved lower MAPEs of 7.9% and 9.5% using XGBoost and 3D-CNNs, respectively. Russello (2018), Terliksiz and Altılar (2019), Gavahi et al. (2021), and Abbaszadeh et al. (2022) applied 3D-CNNs with MODIS data to county-level Soybean yield estimation, obtaining RMSEs of 396.8, 497, 450.6, and 450.6 kg/ha, respectively; Abbaszadeh et al. (2022) also reported a correlation coefficient of 0.73. Our research, while aligning with the predictive accuracies of these studies, distinguishes itself through the use of SAR data, achieving RMSEs of 419.6 and 627.8 kg/ha for XGBoost and 3D-CNNs, respectively, and an r of 0.66, which is competitive considering the disparity in dataset size. Additionally, our findings are consistent with those of J. Sun et al. (2019), who reported an RMSE of 329.5 using a CNN-LSTM model and MODIS data for Soybean. Our comparable RMSEs of 419.6 and 627.8 kg/ha using XGBoost and 3D-CNNs indicate a consistent predictive capability, even with the shift from MODIS to SAR data.

3.4.3. In-Season Yield Estimation

To determine the earliest point at which yield can be estimated without significant loss of accuracy, we systematically reduced the number of Sentinel-1 images used in our models. Figure 3.12 illustrates the MAE and r values in response to this gradual reduction for both the 3D-CNNs and the best-performing ML model, XGBoost. Our analysis started from the harvest time (end of October for Corn and Soybean and end of July for Winter Wheat), which has the lowest MAE and highest r.
The figure shows that up to 36 days before harvest, r remains relatively stable, and the MAE rises only marginally, by 44 kg/ha (7.5%) and 82 kg/ha (9.5%) for all crops using XGBoost and 3D-CNNs, respectively. Earlier than this point, the MAE increases sharply, indicating a rapid loss of predictive accuracy. The smaller number of Sentinel-1 images available for Winter Wheat as we move further from harvest time could also contribute to the increased MAE. Our in-season yield estimation results are consistent with J. Sun et al. (2019), who reported in-season Soybean yield estimation using MODIS and CNN-LSTM on August 21 (two months before harvest) with only a 7% loss of accuracy relative to the yield estimated at harvest.

Figure 3.12: Analysis of yield prediction accuracy over time, depicting MAE and R metrics for 3D-CNNs and XGBoost models as the number of Sentinel-1 images decreases, highlighting the stability of predictions up to 36 days before harvest.

3.5. Discussion

3.5.1. Impact of feature combination on DL and ML models performance

Our ablation study revealed a notable divergence between the 3D-CNNs and XGBoost models in yield prediction using SAR features. The 3D-CNNs model achieved its best performance with a subset of only four features, while the XGBoost model required the full set of nine features. This disparity reflects the different ways in which the two model architectures use input features. The 3D-CNNs model, which automatically extracts spatial and temporal features, handled complex feature interactions efficiently with a reduced feature set. This suggests that the information most relevant for yield prediction was captured by these four features (VH channel, precipitation, and minimum and maximum temperature), highlighting the model's ability to distill and prioritize feature importance.
Conversely, the XGBoost model's reliance on the complete feature set indicates a different interaction with the feature space, where the collective contribution of all features was necessary to maximize predictive performance. The results also bring to the forefront the critical role of feature engineering in ML, particularly for models like XGBoost. Unlike DL models such as 3D-CNNs, which are designed to autonomously discern and leverage intricate data patterns, traditional ML algorithms often hinge on the careful selection and optimization of input features to bolster their predictive accuracy.

The ablation study also underscored the critical role of the VH channel in comparison to other SAR features, a finding that aligns with the research presented by Sharma et al. (2022). Their study demonstrated that an artificial neural network (ANN) model relying on VH polarization surpassed its VV polarization-based counterpart in accuracy for paddy rice yield estimation. Guo et al. (2022) also underscored the significance of pinpointing the optimal blend of SAR features through Jeffries-Matusita (J-M) distance analysis to enhance crop classification accuracy using DL methods. This highlights both the strength of DL in feature mining and its utility in leveraging SAR time series data for effective yield forecasting.

3.5.2. Machine Learning versus Deep Learning: Assessing Yield Prediction Capabilities

The results highlight the subtle performance differences between ML algorithms and DL methods in yield estimation using SAR data at different resolutions. XGBoost's superior performance at the 30-meter resolution across almost all metrics underscores its effectiveness in handling complex, high-dimensional data, likely due to its ensemble learning structure.
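The evaluation metrics referred to throughout this discussion can be computed as in the following minimal sketch. It assumes that d denotes Willmott's index of agreement, which this section does not state explicitly; NRMSE is left out because its normalization (mean or range of the observations) is not specified here. The function name `yield_metrics` and the sample values are hypothetical.

```python
import numpy as np

def yield_metrics(obs, pred):
    """Return MAE, RMSE, Pearson r, MAPE and Willmott's d for yield estimates."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    obs_mean = obs.mean()
    # Willmott's index of agreement (assumed meaning of "d"):
    # d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)
    potential_error = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "r": float(np.corrcoef(obs, pred)[0, 1]),
        "MAPE": float(100.0 * np.mean(np.abs(err / obs))),
        "d": float(1.0 - np.sum(err ** 2) / potential_error),
    }

# Hypothetical observed and predicted yields in kg/ha.
observed = [3000.0, 3500.0, 4000.0, 4500.0]
predicted = [3100.0, 3400.0, 4200.0, 4300.0]
metrics = yield_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because d is bounded between 0 and 1 and rewards agreement in both magnitude and pattern, it can remain relatively high for a model whose MAE and RMSE are not the lowest.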
While the 3D-CNNs model did not lead in performance, its relatively high d values indicate a respectable level of predictive reliability, an expected strength of DL models owing to their capacity for hierarchical feature learning from data. However, it is important to highlight the impact of dataset size on DL model performance. DL models, particularly those employing convolutional neural networks, are inherently data-hungry. They thrive on large, diverse datasets from which they can extract intricate patterns and features at multiple levels of abstraction. In this study, the relatively limited size of the dataset, especially when considering individual crop types, posed a significant constraint. Data from Corn, Soybean, and Winter Wheat had to be pooled for the DL model because the data for each individual crop were insufficient to train the model separately on each crop type. This merging approach, while practical, introduces an inherent challenge: it requires the model to generalize across crops with distinct growth behaviors and yield patterns, potentially obscuring crop-specific nuances essential for precise yield prediction.

Moreover, the use of transfer learning, though a strategic choice given the dataset's constraints, further emphasizes the model's dependency on pre-existing knowledge extracted from a different task (crop classification in this case). While this technique aids model convergence and helps prevent overfitting in the face of limited data, it may also carry over biases from the pre-trained task, affecting the model's ability to learn crop-specific yield determinants effectively.

Looking ahead, there is a compelling case for expanding the dataset, both in terms of the number of samples per crop and the diversity of conditions represented (e.g., irrigated and non-irrigated crops).
By increasing the volume and variety of training data, the DL model's capacity for feature extraction and complex pattern recognition can be more fully leveraged. This, in turn, is anticipated to enhance the model's predictive accuracy, as reflected in potential reductions in MAE and RMSE, and improve its crop-specific yield estimation capabilities. Furthermore, with a more substantial and crop-diverse dataset, future work could explore more nuanced DL architectures and training strategies, potentially developing separate models for each crop type or employing multi-task learning approaches that can simultaneously learn across several related tasks. Such advancements, grounded in richer datasets, hold promise not only for improving yield predictions but also for providing more granular insights into the factors influencing crop yields, ultimately contributing to more informed agricultural decision-making and resource allocation.

3.5.3. Balancing SAR Image Resolution and Yield Prediction Accuracy

The decision to employ a 50-meter resolution for SAR images was initially based on the assumption that aggregating SAR data would mitigate the impact of speckle noise, a common challenge in SAR imagery. It was hypothesized that this aggregation might enhance the clarity of the signals, thereby improving the predictive performance of the models. However, the outcomes indicated a contrary effect. The coarser 50-meter resolution led to a significant reduction in the number of distinct samples available for analysis, as detailed in Table 3.3, which inadvertently constrained the models' learning capabilities due to less granular data.
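The trade-off between speckle suppression and sample count can be illustrated with a toy experiment. The sketch below is an assumption-laden illustration rather than the preprocessing used in this study: it simulates fully developed single-look speckle as multiplicative exponential noise over a uniform field and averages pixels in blocks of an integer factor, whereas the actual 30-meter and 50-meter products are not related by an integer factor. The helper `block_mean` and all numeric values are hypothetical.

```python
import numpy as np

def block_mean(image, factor):
    """Average non-overlapping factor x factor blocks of a 2-D image."""
    rows = image.shape[0] // factor * factor
    cols = image.shape[1] // factor * factor
    trimmed = image[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

rng = np.random.default_rng(1)
true_backscatter = 0.05  # hypothetical linear-power backscatter of a uniform field
# Single-look intensity speckle: multiplicative, exponentially distributed.
single_look = true_backscatter * rng.exponential(1.0, size=(120, 120))
aggregated = block_mean(single_look, 3)

print("pixels before/after:", single_look.size, aggregated.size)
print("relative std before:", single_look.std() / single_look.mean())
print("relative std after: ", aggregated.std() / aggregated.mean())
```

Averaging lowers the relative spread of the signal but divides the number of available pixels by the square of the factor. Over real fields, each block near a boundary also mixes in pixels from neighboring parcels, which this uniform-field toy example does not capture.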
Moreover, the aggregation process introduced an unexpected complication: the incorporation of additional noise from adjacent fields (Figure 3.2). This extraneous information, rather than providing more accurate reflectance and crop condition data, introduced variables that confounded the models, overshadowing the benefits anticipated from speckle reduction. The resultant increase in prediction errors underscores the delicate balance required in choosing an appropriate data resolution, where the intended benefits of noise reduction must be carefully weighed against the potential drawbacks of data aggregation, such as diminished sample detail and the introduction of irrelevant noise from the surrounding environment. This insight highlights the complexities involved in optimizing data preprocessing strategies for ML and DL applications in precision agriculture.

In this study, the 3D-CNNs model showed an advantage over traditional ML in its ability to sustain performance in the presence of noise in the SAR observations. Firstly, it achieved its best performance using only four features, eliminating the need for the incidence angle as an input feature, in stark contrast to the ML methods, which required all nine features, including the incidence angle, for optimal performance. These findings reinforce the assertions made by Garnot et al. (2022) and Gargiulo et al. (2020) about the ability of DL algorithms to circumvent extensive preprocessing of SAR data for speckle filtering. Secondly, despite the limited dataset at the 50-meter resolution and the inherently data-intensive nature of 3D-CNNs, it still surpassed ML methods in filtering out noise from adjacent fields.

3.6. Conclusion

This research embarked on a comprehensive exploration of the capabilities of DL models, specifically 3D-CNNs, in enhancing yield prediction accuracy using multi-temporal Sentinel-1 SAR data.
This study conducted a thorough analysis of SAR data features, uncovering crucial insights into how specific features correlate with variations in crop yield. Notably, the VH channel demonstrated a particularly robust relationship with agricultural yield predictions.

The comparative analysis between traditional ML models and the patch-based 3D-CNNs highlighted the nuanced strengths of each approach. While XGBoost, an ML model, demonstrated robust performance across various metrics, particularly at the 30-meter resolution, it required a comprehensive set of features to achieve optimal results. In contrast, the 3D-CNNs model, despite the dataset's limited size, not only performed commendably with a reduced feature set but also displayed remarkable resilience against noise, particularly at the coarser 50-meter resolution. This resilience is particularly noteworthy, as it suggests an inherent capability of DL models to maintain performance integrity even when external factors introduce data inconsistencies.

Furthermore, the study's findings emphasize the critical role of feature engineering in the realm of ML, where the deliberate selection and optimization of input features significantly influence model performance. On the other hand, DL models, like 3D-CNNs, exhibit an intrinsic ability to autonomously extract and capitalize on complex data relationships, a quality that traditional ML models lack.

In terms of practical applications, the models' ability to predict yields with reasonable accuracy almost one month before harvest presents significant implications for agricultural planning and resource allocation. The models' performance also points to the potential benefits of expanding the dataset and tailoring models to individual crops, which could further enhance prediction accuracy and reduce biases.

In conclusion, this study illustrates the promising potential of ML and DL in agricultural yield prediction.
By harnessing the rich data provided by Sentinel-1 SAR imagery and leveraging the advanced analytical capabilities of these models, stakeholders in the agricultural sector can gain unprecedented insights into crop performance, ultimately contributing to enhanced food security and resource optimization. However, research and development do not end here; the findings also pave the way for future research, particularly in expanding and diversifying datasets and exploring different types of DL architectures using SAR imagery to estimate intra-season crop yield from field to county scale.

REFERENCES

Abbaszadeh, P., Gavahi, K., Alipour, A., Deb, P., & Moradkhani, H. (2022). Bayesian multi-modeling of deep neural nets for probabilistic crop yield prediction. Agricultural and Forest Meteorology, 314, 108773.
Abhishek, A., Phanikumar, M. S., Sendrowski, A., Andreadis, K. M., Hashemi, M. G. Z., Jayasinghe, S., Prasad, P. V. V., Brent, R. J., & Das, N. N. (2023). Dryspells and Minimum Air Temperatures Influence Rice Yields and their Forecast Uncertainties in Rainfed Systems. Agricultural and Forest Meteorology, 341, 109683.
Alebele, Y., Wang, W., Yu, W., Zhang, X., Yao, X., Tian, Y., Zhu, Y., Cao, W., & Cheng, T. (2021). Estimation of crop yield from combined optical and SAR imagery using Gaussian kernel regression. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14, 10520–10534.
Battude, M., Al Bitar, A., Morin, D., Cros, J., Huc, M., Sicre, C. M., Le Dantec, V., & Demarez, V. (2016). Estimating maize biomass and yield over large areas using high spatial and temporal resolution Sentinel-2 like remote sensing data. Remote Sensing of Environment, 184, 668–681.
Bolton, D. K., & Friedl, M. A. (2013). Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agricultural and Forest Meteorology, 173, 74–84.
Boryan, C., Yang, Z., Mueller, R., & Craig, M. (2011).
Monitoring US agriculture: the US department of agriculture, national agricultural statistics service, cropland data layer program. Geocarto International, 26(5), 341–358.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
Clauss, K., Ottinger, M., Leinenkugel, P., & Kuenzer, C. (2018). Estimating rice production in the Mekong Delta, Vietnam, utilizing time series of Sentinel-1 SAR data. International Journal of Applied Earth Observation and Geoinformation, 73, 574–585.
Cosh, M. H., White, W. A., Colliander, A., Jackson, T. J., Prueger, J. H., Hornbuckle, B. K., Hunt, E. R., McNairn, H., Powers, J., Walker, V. A., & Bullock, P. (2019). Estimating vegetation water content during the Soil Moisture Active Passive Validation Experiment 2016. Journal of Applied Remote Sensing, 13(01), 1. https://doi.org/10.1117/1.JRS.13.014516
Cué La Rosa, L. E., Queiroz Feitosa, R., Nigri Happ, P., Del’Arco Sanches, I., & Ostwald Pedro da Costa, G. A. (2019). Combining deep learning and prior knowledge for crop mapping in tropical regions from multitemporal SAR image sequences. Remote Sensing, 11(17), 2029.
Dadhwal, V. K. (2003). Crop growth and productivity monitoring and simulation using remote sensing and GIS. Satellite Remote Sensing and GIS Applications in Agricultural Meteorology, 263–289.
Dang, C., Liu, Y., Yue, H., Qian, J., & Zhu, R. (2021). Autumn crop yield prediction using data-driven approaches:-support vector machines, random forest, and deep neural network methods. Canadian Journal of Remote Sensing, 47(2), 162–181.
den Besten, N., Dunne, S. S., Mahmud, A., Jackson, D., Aouizerats, B., de Jeu, R., Burger, R., Houborg, R., McGlinchey, M., & van der Zaag, P. (2023). Understanding Sentinel-1 backscatter response to sugarcane yield variability and waterlogging.
Remote Sensing of Environment, 290, 113555.
El-Hajj, M., Baghdadi, N., Cheviron, B., Belaud, G., & Zribi, M. (2016). Integration of remote sensing derived parameters in crop models: application to the PILOTE model for hay production. Agricultural Water Management, 176, 67–79.
Fernandez-Beltran, R., Baidar, T., Kang, J., & Pla, F. (2021). Rice-yield prediction with multi-temporal sentinel-2 data and 3D CNN: A case study in Nepal. Remote Sensing, 13(7), 1391.
Fontanelli, G., Lapini, A., Santurri, L., Pettinato, S., Santi, E., Ramat, G., Pilia, S., Baroni, F., Tapete, D., & Cigna, F. (2022). Early-Season Crop Mapping on an Agricultural Area in Italy Using X-Band Dual-Polarization SAR Satellite Data and Convolutional Neural Networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 6789–6803.
Gargiulo, M., Dell’Aglio, D. A. G., Iodice, A., Riccio, D., & Ruello, G. (2020). Integration of sentinel-1 and sentinel-2 data for land cover mapping using w-net. Sensors, 20(10), 2969.
Garnot, V. S. F., Landrieu, L., & Chehata, N. (2022). Multi-modal temporal attention models for crop mapping from satellite time series. ISPRS Journal of Photogrammetry and Remote Sensing, 187, 294–305.
Gavahi, K., Abbaszadeh, P., & Moradkhani, H. (2021). DeepYield: A combined convolutional neural network with long short-term memory for crop yield forecasting. Expert Systems with Applications, 184, 115511.
Guo, Z., Qi, W., Huang, Y., Zhao, J., Yang, H., Koo, V.-C., & Li, N. (2022). Identification of crop type based on C-AENN using time series Sentinel-1A SAR data. Remote Sensing, 14(6), 1379.
Hashemi, M. G. Z., Abhishek, A., Jalilvand, E., Jayasinghe, S., Andreadis, K. M., Siqueira, P., & Das, N. N. (2022). Assessing the impact of Sentinel-1 derived planting dates on rice crop yield modeling. International Journal of Applied Earth Observation and Geoinformation, 114, 103047. https://doi.org/10.1016/j.jag.2022.103047
Hashemi, M. G.
Z., Jalilvand, E., Alemohammad, H., Tan, P.-N., & Das, N. N. (2023). A Systematic Review of Synthetic Aperture Radar and Deep Learning in Agricultural Applications (Under Review). ISPRS Journal of Photogrammetry and Remote Sensing.
Hoekman, D. H., & Bouman, B. A. M. (1993). Interpretation of C- and X-band radar images over an agricultural area, the Flevoland test site in the Agriscatt-87 campaign. International Journal of Remote Sensing, 14(8), 1577–1594.
Jones, J. W., Hoogenboom, G., Porter, C. H., Boote, K. J., Batchelor, W. D., Hunt, L. A., Wilkens, P. W., Singh, U., Gijsman, A. J., & Ritchie, J. T. (2003). The DSSAT cropping system model. European Journal of Agronomy, 18(3–4), 235–265.
Judge, J., Liu, P.-W., Monsiváis-Huertero, A., Bongiovanni, T., Chakrabarti, S., Steele-Dunne, S. C., Preston, D., Allen, S., Bermejo, J. P., Rush, P., DeRoo, R., Colliander, A., & Cosh, M. (2021). Impact of vegetation water content information on soil moisture retrievals in agricultural regions: An analysis based on the SMAPVEX16-MicroWEX dataset. Remote Sensing of Environment, 265, 112623. https://doi.org/10.1016/j.rse.2021.112623
Kamilaris, A., & Prenafeta-Boldú, F. X. (2018). A review of the use of convolutional neural networks in agriculture. The Journal of Agricultural Science, 156(3), 312–322.
Khaki, S., & Wang, L. (2019). Crop yield prediction using deep neural networks. Frontiers in Plant Science, 10, 621.
Kussul, N., Lavreniuk, M., Skakun, S., & Shelestov, A. (2017). Deep learning classification of land cover and crop types using remote sensing data. IEEE Geoscience and Remote Sensing Letters, 14(5), 778–782.
Leroux, L., Castets, M., Baron, C., Escorihuela, M.-J., Bégué, A., & Seen, D. Lo. (2019). Maize yield estimation in West Africa from crop process-induced combinations of multi-domain remote sensing indices. European Journal of Agronomy, 108, 11–26.
Luck, J. D., & Fulton, J. P. (2014).
Precision agriculture: Best management practices for collecting accurate yield data and avoiding errors during harvest. University of Nebraska-Lincoln, Extension.
McDonald, A. J., Bennett, J. C., Cookmartin, G., Crossley, S., Morrison, K., & Quegan, S. (2000). The effect of leaf geometry on the microwave backscatter from leaves. International Journal of Remote Sensing, 21(2), 395–400.
McNairn, H., Kross, A., Lapen, D., Caves, R., & Shang, J. (2014). Early season monitoring of corn and soybeans with TerraSAR-X and RADARSAT-2. International Journal of Applied Earth Observation and Geoinformation, 28, 252–259.
Muruganantham, P., Wibowo, S., Grandhi, S., Samrat, N. H., & Islam, N. (2022). A systematic literature review on crop yield prediction with deep learning and remote sensing. Remote Sensing, 14(9), 1990.
Nana, E., Corbari, C., & Bocchiola, D. (2014). A model for crop yield and water footprint assessment: Study of maize in the Po valley. Agricultural Systems, 127, 139–149.
Nasirzadehdizaji, R., Balik Sanli, F., Abdikan, S., Cakir, Z., Sekertekin, A., & Ustuner, M. (2019). Sensitivity analysis of multi-temporal Sentinel-1 SAR parameters to crop height and canopy coverage. Applied Sciences, 9(4), 655.
Oquab, M., Bottou, L., Laptev, I., & Sivic, J. (2014). Learning and transferring mid-level image representations using convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1717–1724.
Qiao, M., He, X., Cheng, X., Li, P., Luo, H., Tian, Z., & Guo, H. (2021). Exploiting hierarchical features for crop yield prediction based on 3-D convolutional neural networks and multikernel Gaussian process. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14, 4476–4489.
Russello, H. (2018). Convolutional neural networks for crop yield prediction using satellite images. IBM Center for Advanced Studies.
Sharma, P. K., Kumar, P., Srivastava, H. S., & Sivasankar, T. (2022).
Assessing the potentials of multi-temporal sentinel-1 SAR data for paddy yield forecasting using artificial neural network. Journal of the Indian Society of Remote Sensing, 50(5), 895–907.
Skakun, S., Kussul, N., Shelestov, A. Y., Lavreniuk, M., & Kussul, O. (2015). Efficiency assessment of multitemporal C-band Radarsat-2 intensity and Landsat-8 surface reflectance satellite imagery for crop classification in Ukraine. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(8), 3712–3719.
Soudani, K., le Maire, G., Dufrêne, E., François, C., Delpierre, N., Ulrich, E., & Cecchini, S. (2008). Evaluation of the onset of green-up in temperate deciduous broadleaf forests derived from Moderate Resolution Imaging Spectroradiometer (MODIS) data. Remote Sensing of Environment, 112(5), 2643–2655.
Steele-Dunne, S. C., McNairn, H., Monsivais-Huertero, A., Judge, J., Liu, P.-W., & Papathanassiou, K. (2017). Radar remote sensing of agricultural canopies: A review. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(5), 2249–2273.
Sun, J., Di, L., Sun, Z., Shen, Y., & Lai, Z. (2019). County-level soybean yield prediction using deep CNN-LSTM model. Sensors, 19(20), 4363.
Teimouri, M., Mokhtarzade, M., Baghdadi, N., & Heipke, C. (2022). Fusion of time-series optical and SAR images using 3D convolutional neural networks for crop classification. Geocarto International, 1–18.
Terliksiz, A. S., & Altılar, D. T. (2019). Use of deep neural networks for crop yield prediction: A case study of soybean yield in Lauderdale County, Alabama, USA. 2019 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics), 1–4.
Thorp, K. R., & Drajat, D. (2021). Deep machine learning with Sentinel satellite data to map paddy rice production stages across West Java, Indonesia. Remote Sensing of Environment, 265, 112679.
Togliatti, K., Lewis-Beck, C., Walker, V. A., Hartman, T., VanLoocke, A., Cosh, M. H., & Hornbuckle, B. K. (2022).
Quantitative Assessment of Satellite L-Band Vegetation Optical Depth in the U.S. Corn Belt. IEEE Geoscience and Remote Sensing Letters, 19, 1–5. https://doi.org/10.1109/LGRS.2020.3034174
Yang, H., Pan, B., Li, N., Wang, W., Zhang, J., & Zhang, X. (2021). A systematic method for spatio-temporal phenology estimation of paddy rice using time series Sentinel-1 images. Remote Sensing of Environment, 259, 112394.
Yang, Q., Shi, L., & Lin, L. (2019). Plot-scale rice grain yield estimation using UAV-based remotely sensed images via CNN with time-invariant deep features decomposition. IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, 7180–7183.
Zhao, W., Qu, Y., Zhang, L., & Li, K. (2022). Spatial-aware SAR-optical time-series deep integration for crop phenology tracking. Remote Sensing of Environment, 276, 113046.
Zhong, L., Hu, L., & Zhou, H. (2019). Deep learning based multi-temporal crop classification. Remote Sensing of Environment, 221, 430–443. https://doi.org/10.1016/j.rse.2018.11.032

APPENDIX

Table B1: Simulation results using 30-meter Sentinel-1 resolution.

All crops, 30-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   871.62        1215.94        0.92    0.11        14.65      0.95
RF        708.25        1025.22        0.94    0.09        9.32       0.97
XGBoost   590.80        874.37         0.96    0.08        9.53       0.98
SVM       1508.09       2114.61        0.74    0.20        24.84      0.85

Winter Wheat, 30-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   550.22        643.91         0.60    0.41        9.48       0.69
XGBoost   322.23        362.56         0.72    0.16        7.88       0.91

Soybean, 30-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   594.96        627.83         0.66    0.35        15.79      0.78
XGBoost   340.32        419.63         0.66    0.19        8.60       0.92

Corn, 30-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   1657.69       1980.30        0.67    0.34        18.718     0.77
XGBoost   1237.26       1504.18        0.90    0.26        13.95      0.91

Table B2: Simulation results using 50-meter Sentinel-1 resolution.
All crops, 50-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   1282.51       1881.11        0.83    0.16        31.10      0.90
RF        1583.73       2037.92        0.83    0.18        45.06      0.86
XGBoost   1509.22       1968.48        0.80    0.18        36.70      0.86
SVM       1802.46       2272.46        0.75    0.19        47.14      0.82

Winter Wheat, 50-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   685.07        859.11         0.40    0.56        11.76      0.58
XGBoost   1081.36       1328.27        -0.66   0.54        21.00      0.26

Soybean, 50-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   635.08        934.33         0.66    0.29        31.04      0.73
XGBoost   1068.86       1412.59        0.36    0.40        48.00      0.58

Corn, 50-meter resolution (test)
Model     MAE [kg/ha]   RMSE [kg/ha]   R [-]   NRMSE [-]   MAPE [%]   D [-]
3D-CNNs   1289.15       1458.59        0.73    0.30        12.22      0.81
XGBoost   2598.18       2958.28        0.92    0.55        17.00      0.75

4. CHAPTER 4: ESTIMATING CROP BIOPHYSICAL PARAMETERS USING SELF-SUPERVISED LEARNING WITH FOUNDATION MODELS AND SAR-OPTICAL OBSERVATIONS

4.1. Introduction

The estimation of Vegetation Water Content (VWC) is of crucial importance in various environmental and agricultural applications, as well as for satellite-based remote sensing retrieval algorithms. As a critical biophysical parameter, VWC significantly influences agricultural land monitoring, yield forecasting, soil moisture (SM) retrieval, and wildland fire risk assessment (Fensholt and Sandholt, 2003). Accurate VWC estimation is vital for providing real-time insights into crop status, enabling farmers to optimize timing for agricultural interventions, thereby maximizing yields and resource efficiency (Khabbazan et al., 2022). Additionally, VWC is crucial in understanding plant productivity and crop growth, as these are largely dependent on water availability within the root zone (Friesen et al., 2012). It also plays a key role in regulating evapotranspiration, accounting for 60% of water returned to the atmosphere by vegetation (Oki and Kanae, 2006).
For SM retrievals from microwave remote sensing (RS) observations, VWC, along with surface roughness (SR), is essential in models like the zeroth-order single-scattering Tau-Omega model (Jackson et al., 1982), the single-scattering Water Cloud Model (WCM) (Attema and Ulaby, 1978), and the Bayesian model for microwave emission and scattering (Pierdicca et al., 2010). Therefore, the accuracy of VWC estimates is crucial in reducing uncertainties in these model-based retrievals, which directly impacts the reliability of SM and Vegetation Optical Depth (VOD) measurements (Judge et al., 2021).

Simply put, VWC is the mass of water contained in vegetation per unit area, expressed in kilograms per square meter (Hunt et al., 2011). VWC can be defined at three different scales: leaf, plant, and canopy (Hunt Jr et al., 2018). Leaf VWC is the water mass per unit leaf area (Ceccato et al., 2001; Jacquemoud et al., 2009), while plant VWC is defined as the ratio of water content weight to total plant weight. Finally, canopy VWC can be determined by measuring the water mass of vegetation per ground area.

VWC measurement, traditionally reliant on labor-intensive field sampling (Vermunt et al., 2021; Ye et al., 2021), has evolved with the advent of satellite observations, particularly multi-spectral and microwave sensors operating in C-band and higher frequencies (Steele-Dunne et al., 2017). Optical and thermal indices like the Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Water Index (NDWI) are effective in low vegetation conditions, especially when VWC is below 4 kg/m² (Cosh et al., 2019), and they have been widely used to estimate VWC (Cosh et al., 2019; Gao et al., 2015; Hunt et al., 2011; Jackson et al., 2004; Judge et al., 2021; Michael H. Cosh, 2010).
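Both indices are normalized band differences and can be sketched in a few lines. The example below assumes that NDVI contrasts near-infrared (NIR) with red reflectance and that NDWI contrasts NIR with shortwave-infrared (SWIR) reflectance; the specific sensor bands, the function names, and the reflectance values are illustrative assumptions and are not taken from this study's processing chain.

```python
import numpy as np

def normalized_difference(a, b):
    """Compute (a - b) / (a + b), returning NaN where a + b is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out

def ndvi(nir, red):
    """NDVI: sensitive mainly to chlorophyll and canopy greenness."""
    return normalized_difference(nir, red)

def ndwi(nir, swir):
    """NDWI (NIR/SWIR form, assumed here): sensitive to liquid water in vegetation."""
    return normalized_difference(nir, swir)

# Hypothetical surface reflectances for three pixels.
nir = np.array([0.45, 0.30, 0.00])
red = np.array([0.05, 0.10, 0.00])
swir = np.array([0.15, 0.20, 0.00])

print("NDVI:", ndvi(nir, red))
print("NDWI:", ndwi(nir, swir))
```

A regression of field-measured VWC on either index, fitted per crop, is the typical way such indices are turned into VWC estimates in the studies cited above.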
However, multi-spectral sensors are adversely affected by the presence of clouds, and they are also affected by background, aerosol, and saturation in high biomass regions (Soudani et al., 2008). Moreover, they are less sensitive to the water beneath the canopy surface, such as in the stem and ears, which can lead to underestimation of plant water and saturation at different points in the growing season (Judge et al., 2021; Togliatti et al., 2022). Among optical-thermal vegetation indices, NDVI, popular for VWC estimation (Jackson et al., 2004), is limited by its focus on chlorophyll rather than water content (Chen et al., 2005; Huang et al., 2009). NDWI, on the other hand, targets water absorption in shortwave infrared bands (Hunt Jr et al., 2018; Wang et al., 2013) and shows stronger correlations with crop water content (Jackson et al., 2004). Huang et al. (2009) demonstrated that NDWI outperforms NDVI in estimating VWC for Corn and Soybeans. Building on this, Xu et al. (2020) investigated high-resolution VWC estimation using linear regression with NDWI, achieving R² values of 0.44 to 0.85 for corn and soybean plants and canopies in Iowa. Their study focused on a brief period of crop reproductive growth in August, likely benefiting from clear sky conditions and increased availability of optical imagery. Further supporting NDWI's efficacy, Cosh et al. (2019) and Judge et al. (2021) showed that NDWI-based regression models for VWC estimation result in lower Root Mean Square Deviation (RMSD) for SM retrieval compared to NDVI-based approaches.

Recent advancements in agricultural monitoring have highlighted the efficacy of low-frequency Synthetic Aperture Radar (SAR) data, particularly in crop monitoring, SM estimation, and biophysical parameters extraction.
Studies have consistently demonstrated the sensitivity of radar backscatter intensity to various crop biophysical parameters, with a notable correlation between backscatter coefficients, Radar Vegetation Indices (RVI), and VWC (Huang et al., 2016; Liu et al., 2013; Vreugdenhil et al., 2018; Yihyun Kim et al., 2014, 2012). The sensitivity of backscatter to the dielectric properties of vegetation indicates a strong relationship between crop water content and the backscatter coefficient (Konings et al., 2019; Steele-Dunne et al., 2017). SAR indices, however, are influenced by factors such as leaf size, canopy structure, SM, SR and surface canopy water (SCW), which can affect their accuracy (Judge et al., 2021; Khabbazan et al., 2022; Kim et al., 2011; Vermunt et al., 2022).

The effectiveness of SAR in agricultural applications is further nuanced by the choice of frequency (Hashemi et al., 2024; Steele-Dunne et al., 2017). C-band SAR signals, with their shorter wavelengths compared to L-band, interact more with smaller vegetation elements like leaves and small stems, making them suitable for discriminating herbaceous crops such as wheat, alfalfa, and canola, even at moderate growth stages. In contrast, L-band SAR signals, having longer wavelengths, are less affected by the upper canopy layers and interact more with intermediate-sized crop elements like stems and leaf ribs of wide-leaf crops such as corn and sunflower (Ferrazzoli et al., 1997). Kim et al. (2018) observed a lower correlation between VWC and L-band radar backscatter for less dense, herbaceous crops like winter wheat and soybean compared to thicker crops like corn, attributing this to the higher influence of SM and SR in shorter, less dense crops. Moreover, Huang et al. (2016) and Hosseini et al. (2015) revealed that spatial variations in vegetation and surface conditions can substantially affect these correlations.
This effect is more pronounced in airborne and satellite observations than in scatterometer L-band SAR data, highlighting the necessity for more sophisticated models that incorporate these varying environmental variables. Further, Judge et al. (2021) advanced this understanding by developing multiple linear regressions to estimate VWC using both Visible to Shortwave Infrared (VSWIR) and radar vegetation indices. Their results demonstrated that near-real-time NDWI and L-band Circularly Polarized Ratio (CRvv)-derived VWC could enhance the accuracy of the SMAP single channel retrieval algorithm in SM retrieval. They highlighted the dynamic influence of soil characteristics and vegetation on radar backscatter across different growth stages of corn in Iowa.

Recent studies by Khabbazan et al. (2022) and Vermunt et al. (2022) explored the relation between VWC, SCW and L-band SAR observables. The findings of Khabbazan et al. (2022) indicated that SCW can potentially increase radar backscatter by up to 2 dB. However, they noted that this effect was less significant in the case of cross ratio and RVI. Complementing this, Vermunt et al. (2022) developed a multiple linear regression model incorporating VWC, SCW, and SM to examine the influence of VWC on L-band backscatter. Their findings revealed that internal VWC can fluctuate by 10%-20% throughout the day under non-stressed conditions, and that this fluctuation can increase to 35% under stress.

Previous studies on VWC estimation using SAR predominantly employed linear regression or the WCM (Attema and Ulaby, 1978) with limited labeled VWC datasets. However, the application of the WCM encounters several challenges.
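For reference, the WCM in its commonly used form expresses total backscatter as a vegetation term plus a soil term attenuated twice by the canopy. The sketch below is a minimal forward model under stated assumptions: both vegetation descriptors are set to VWC, all backscatter values are in linear power units, and the parameter values for `a` and `b` are hypothetical placeholders that would need site- and crop-specific calibration. It is not the configuration of any study cited here.

```python
import numpy as np

def two_way_attenuation(vwc, theta_deg, b):
    """Canopy two-way attenuation: tau^2 = exp(-2 * B * VWC / cos(theta))."""
    cos_t = np.cos(np.radians(theta_deg))
    return np.exp(-2.0 * b * np.asarray(vwc, dtype=float) / cos_t)

def water_cloud_backscatter(vwc, sigma_soil, theta_deg, a, b):
    """Total backscatter (linear units) from the Water Cloud Model.

    sigma_total = A * VWC * cos(theta) * (1 - tau^2) + tau^2 * sigma_soil
    The first term is the vegetation contribution; the second is the soil
    contribution attenuated by the canopy. No vegetation-soil interaction
    term is included, which is the independence assumption of the model.
    """
    vwc = np.asarray(vwc, dtype=float)
    cos_t = np.cos(np.radians(theta_deg))
    tau2 = two_way_attenuation(vwc, theta_deg, b)
    sigma_veg = a * vwc * cos_t * (1.0 - tau2)
    return sigma_veg + tau2 * sigma_soil

# Hypothetical inputs: VWC in kg/m^2, soil backscatter in linear units.
vwc_values = np.array([0.0, 1.0, 3.0, 5.0])
sigma_total = water_cloud_backscatter(
    vwc_values, sigma_soil=0.05, theta_deg=35.0, a=0.01, b=0.1
)
for v, s in zip(vwc_values, sigma_total):
    print(f"VWC = {v:.1f} kg/m^2 -> sigma0 = {10 * np.log10(s):.2f} dB")
```

Estimating VWC with this model means inverting it, which requires values for `a`, `b`, and the soil term; the calibration burden comes from these quantities.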
It requires labor-intensive calibration of parameters like vegetation attenuation (Kumar et al., 2012), tends to oversimplify heterogeneous canopies (Khabbazan et al., 2019), and operates under the assumption that vegetation and ground scattering are independent, an assumption that may not be valid in dense, multi-layered vegetative areas.

This study aims to address these limitations by employing foundation models (FMs), with VWC estimation as the downstream task. FMs are large-scale models pre-trained on vast, unlabeled datasets using self-supervision. They are designed to be generalist, understanding a wide range of inputs and tasks. These models can then be fine-tuned on smaller, task-specific datasets. Their architecture, often based on transformers, allows them to excel in multiple domains, from natural language processing to vision tasks, by adjusting the model's head for specific downstream tasks (Jakubik et al., 2023). In this chapter we propose a first-of-its-kind framework for the creation of geospatial FMs to accelerate the development and deployment of crop biophysical parameter estimation. FMs can relieve the scarcity of ground reference data in crop monitoring applications: the model is first pre-trained through self-supervision, which needs no labeled data, and is then fine-tuned for downstream tasks with a small labeled dataset. Furthermore, FMs can address the limitations of the WCM by capturing complex relationships between SAR and VSWIR data, SM, SR, and VWC.

In recent years, researchers have applied geospatial FMs incorporating various Self-Supervised Learning (SSL) techniques to remote sensing (RS) tasks. These techniques include contrastive learning, Masked Autoencoder (MAE), Masked Image Modeling (MIM), self-DIstillation with NO labels (DINO), Bootstrap Your Own Latent (BYOL), Momentum Contrast (MOCO), Change-Aware Sampling and Contrastive Learning (CACo), and Seasonal Contrast (SeCo).
These approaches, often implemented with transformer or vision transformer (ViT) architectures, have been adapted for diverse RS data types such as SAR, optical/thermal, and LiDAR for image/scene classification (Ayush et al., 2021; Bastani et al., 2023; Muhtar et al., 2023; Prexl and Schmitt, 2023; Tsaris et al., 2024), semantic segmentation (Bountos et al., 2023; Fuller et al., 2023; Jain et al., 2022; Wang et al., 2023; Y. Wang et al., 2022), object detection (Li et al., 2021; D. Wang et al., 2022; Zhang et al., 2022), change detection (Mall et al., 2023; Manas et al., 2021), land cover classification (Prexl and Schmitt, 2023; D. Wang et al., 2022a; Zhang et al., 2022), and crop mapping (Xu et al., 2024). To illustrate the effectiveness of these approaches, Scheibenreif et al. (2022) demonstrated that SSL techniques combined with ViT (Dosovitskiy, 2020) architectures outperformed ConvNets in classification and segmentation tasks. Their study revealed significant performance improvements, with gains of up to 30% over supervised baselines across various downstream tasks when fine-tuned with small labeled datasets.

This study pioneers the application of geospatial FMs and advanced machine learning (ML) methods, including random forest (RF) and XGBoost (XGB), for fine-scale (50-meter) VWC estimation. By leveraging freely available Sentinel-1A C-band and Sentinel-2 Visible and Shortwave Infrared (VSWIR) data, our research demonstrates the significant potential of ML and deep learning (DL) in enhancing SAR and VSWIR-based VWC estimation. This work sets a precedent for future studies on the use of FMs in crop monitoring and agricultural practice management, particularly addressing the challenge of limited labeled data. Our comprehensive case study encompasses a diverse range of climatic conditions, from the humid continental climate of Iowa to the temperate climate of Michigan and the humid subtropical climate of Florida.
This diversity extends to field conditions and practice management, including both irrigated and non-irrigated fields, as well as varying tillage practices such as fields with crop residue and those fully plowed. The breadth of our study settings significantly enhances the applicability and relevance of our findings across different agricultural contexts.

Furthermore, we evaluate the potential of multi-task learning (MTL), simultaneously estimating crop height and VWC, to enhance overall prediction accuracy. The rationale for using MTL stems from the intrinsic relationship between crop height and VWC. Crop height can be indicative of plant health and biomass, which are closely related to the water content of the vegetation. Accurate estimates of crop height can thus provide contextual information that enhances the prediction of VWC, particularly with small labeled datasets.

By addressing these critical aspects and leveraging advanced modeling techniques, our study not only advances the technical capabilities of VWC and crop height estimation but also opens new avenues for accurate, timely, and scalable crop monitoring. This comprehensive approach has the potential to transform agricultural decision-making processes, offering farmers and agronomists more reliable data for optimizing resource management and improving crop yields across diverse agricultural landscapes.

4.2. Measurements and Case Study

4.2.1. SAR and Optical Data Sources

Our study leverages a comprehensive suite of RS data to address the challenges of crop biophysical parameter estimation. We strategically combined SAR and optical data to exploit their complementary strengths, focusing on freely available and regularly acquired satellite observations.

4.2.1.1. Sentinel-1A SAR Data

For our study, we leveraged Sentinel-1A C-band SAR data, chosen for its advantageous combination of regular global coverage, open data policy, and practical applicability to agricultural monitoring.
We downloaded ascending Radiometrically Terrain Corrected (RTC) Sentinel-1A images with 30-meter resolution, spanning 2016, 2018, and 2022-2023, from the Alaska Satellite Facility (ASF). ASF provides these RTC products with speckle filtering at no cost (ASF, 2023). To further reduce speckle noise, we resampled the images from 30 meters to 50 meters using bilinear sampling. This study leveraged both single and dual polarization (VH, VV) SAR data to estimate VWC and crop height.

To enhance our analysis, we derived the entropy, anisotropy, and alpha parameters introduced by Cloude and Pottier (Lee and Pottier, 2017). These polarimetric parameters provide valuable insights into vegetation structure and scattering mechanisms:

Entropy (H) measures the degree of randomness in scattering (0 to 1), with values near zero indicating single scattering (e.g., smooth bare soils) and higher values indicating multiple scattering events (e.g., developing crop canopy) (Lee and Pottier, 2017; Xu and Jin, 2005).

Anisotropy (A) estimates the relative importance of secondary scattering mechanisms, with zero indicating two mechanisms of equal proportions, and values approaching 1 indicating dominance of the second mechanism over the third (Lee and Ainsworth, 2010).

Alpha (α) angle determines the scattering source: angles close to 0° indicate single bounce scattering, around 45° indicate volume scattering, and near 90° indicate double bounce scattering observed in developed stalks.

We utilized the Sentinel Application Platform (SNAP) software to process and analyze these polarimetric parameters.

To further enrich our dataset, we incorporated additional SAR indices, including the cross-ratio and the RVI. We calculated both the C-band RVI, $4\sigma^0_{VH}/(\sigma^0_{VH} + \sigma^0_{VV})$ (Nasirzadehdizaji et al., 2019), and the L-band RVI, $8\sigma^0_{HV}/(\sigma^0_{HH} + \sigma^0_{VV} + 2\sigma^0_{HV})$ (Zhang et al., 2020), for our model analysis.
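The two RVI expressions and the cross-ratio can be sketched in a few lines. This is only an illustration, not the processing code used in the study: the function names are hypothetical, the cross-ratio is taken to be VH/VV, and the inputs are assumed to arrive in decibels and to need conversion to linear power before the ratios are formed.

```python
def db_to_linear(sigma0_db):
    """Convert a backscatter coefficient from decibels to linear power."""
    return 10.0 ** (sigma0_db / 10.0)

def c_band_rvi(vh, vv):
    """Dual-pol C-band RVI: 4*VH / (VH + VV), inputs in linear power."""
    return 4.0 * vh / (vh + vv)

def l_band_rvi(hv, hh, vv):
    """Quad-pol L-band RVI: 8*HV / (HH + VV + 2*HV), inputs in linear power."""
    return 8.0 * hv / (hh + vv + 2.0 * hv)

def cross_ratio(vh, vv):
    """Cross-ratio VH/VV in linear power (assumed definition)."""
    return vh / vv

# Hypothetical pixel values in dB, chosen only for illustration.
vh = db_to_linear(-20.0)
vv = db_to_linear(-10.0)
rvi_c = c_band_rvi(vh, vv)
cr = cross_ratio(vh, vv)
```

Forming the ratios in linear power rather than in dB matters here: a ratio of dB values is not the same quantity as the ratio of the underlying backscatter coefficients.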
Importantly, to account for the impact of acquisition geometry on SAR backscatter, we included the incidence angle as a feature in our ML models and FMs. This inclusion allows for a more accurate interpretation of the SAR data across different acquisition conditions.

4.2.1.2. Sentinel-2 Optical Data

To complement our SAR data, we utilized Sentinel-2 multispectral imagery, which provides valuable information on vegetation spectral properties. We acquired atmospherically corrected Sentinel-2 images for the years 2016, 2018, 2022, and 2023 through Google Earth Engine. To maintain consistency with our SAR data, we resampled these images to a 50-meter resolution. From the Sentinel-2 data, we derived three key optical indices known for their effectiveness in estimating crop height and VWC: i) NDVI: $(R_{NIR} - R_{Red})/(R_{NIR} + R_{Red})$, ii) NDWI: $(R_{Green} - R_{NIR})/(R_{Green} + R_{NIR})$ and $(R_{NIR} - R_{SWIR})/(R_{NIR} + R_{SWIR})$, and iii) red-edge (band 7).

As illustrated in Figures 4.3, 4.6, 4.8, and 4.9, the temporal coverage of Sentinel-2 imagery exhibits gaps due to cloud cover, a limitation inherent to optical sensors. To address these data gaps and ensure a continuous time series, we applied linear interpolation between available observations. This approach allows us to maintain a consistent temporal resolution across our dataset, aligning the optical data with our SAR observations and in-situ measurements.

4.2.2. Climate data

Our analysis incorporated three main climate parameters, each calculated from the crop planting date up to each time-step in the Sentinel-1A backscatter time series: i) Accumulated Precipitation: the total rainfall from planting to each observation point; ii) Minimum Temperature: the average of daily minimum temperatures from planting to each time-step; and iii) Maximum Temperature: the average of daily maximum temperatures from planting to each time-step.
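A minimal sketch of how these three cumulative climate features could be derived from a daily weather record follows. The function name, the tuple layout of the daily record, and the choice to include both the planting day and the observation day in the window are assumptions made for illustration.

```python
from datetime import date

def climate_features(daily, planting, obs_date):
    """Summarise weather from planting up to an observation date.

    daily: list of (date, precip_mm, tmin_c, tmax_c) tuples, one per day.
    Both end dates are included in the window (an assumption of this sketch).
    Returns (accumulated precipitation, mean daily Tmin, mean daily Tmax).
    """
    window = [d for d in daily if planting <= d[0] <= obs_date]
    if not window:
        raise ValueError("no weather records between planting and observation")
    acc_precip = sum(d[1] for d in window)
    mean_tmin = sum(d[2] for d in window) / len(window)
    mean_tmax = sum(d[3] for d in window) / len(window)
    return acc_precip, mean_tmin, mean_tmax

# Hypothetical three-day record, for illustration only.
daily = [
    (date(2023, 5, 1), 0.0, 8.0, 18.0),
    (date(2023, 5, 2), 5.0, 10.0, 20.0),
    (date(2023, 5, 3), 2.5, 12.0, 22.0),
]
features = climate_features(daily, date(2023, 5, 1), date(2023, 5, 3))
```

Calling the function once per Sentinel-1A acquisition date would yield one feature triple per time-step of the backscatter series.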
This approach allows us to capture the cumulative effects of weather conditions throughout the growing season, providing a more comprehensive view of the climate's impact on crop development. For the non-irrigated fields in Michigan (2022-2023 campaigns), the climate data were obtained from the Kellogg Biological Station Long Term Ecological Research (KBS-LTER) weather station (https://lter.kbs.msu.edu/datatables/7). This station provides high-quality, continuous meteorological observations specific to our study area. For the other sites, including non-irrigated fields in Iowa and irrigated fields in Michigan and Florida, a combination of satellite-derived and global precipitation products was used. We utilized Moderate Resolution Imaging Spectroradiometer (MODIS) satellite-derived products for minimum and maximum temperature data. We employed Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) for precipitation information. CHIRPS is a global rainfall dataset that combines satellite imagery with in-situ station data to create gridded rainfall time series.

4.2.3. Reference datasets

In this research, the case studies were selected based on the availability of high-quality reference datasets for parameters such as VWC, crop height, SM, SR, and SAR backscatter data at C-band and L-band frequencies. We carefully selected three distinct sites, ensuring that a comprehensive range of vegetation conditions could be analyzed. Figure 4.1 illustrates the geographical locations of the three field campaigns used in this study. The polygons in Iowa and Michigan represent the areas used for supervised learning, while the surrounding Google Earth imagery delineates the regions used for SSL. Each of these datasets is discussed in detail in the following sections.
The single field in Florida, also shown on the map, was used solely for generalizing our results and was not included in either the supervised or SSL processes.

Figure 4.1: Geographical locations of the reference datasets and areas used for SSL. The red, blue, and green polygons indicate the regions of the reference datasets in Iowa (2016), Michigan (2022-2023), and Florida (2018), respectively. The surrounding areas shown in Google Earth imagery represent the regions used for SSL, excluding the area in Florida.

4.2.3.1. Soybeans and Corn in Iowa: SMAPVEX16

The Soil Moisture Active Passive Validation Experiment 2016 (SMAPVEX16) campaign, conducted at the South Fork Core Validation Site (CVS) in Iowa's Corn Belt, aimed to validate SM observations by NASA's SMAP satellite (Colliander et al., 2017). Established in 2013 by the USDA Agricultural Research Service (Coopersmith et al., 2015), this site spans latitudes 42°N to 43°N and longitudes 93°W to 94°W (Figure 4.1). The region, characterized by a sub-humid climate with annual precipitation of around 800 mm, primarily supports rainfed corn and soybean cultivation. In 2016, two intensive observation periods (IOPs) were held: IOP-1 from May 28 to June 5, when crops were in early growth stages, and IOP-2 from August 3 to August 16, during ear and pod formation. The soil texture was predominantly silt loam, suitable for agriculture (Cosh et al., 2019).

In-situ measurements in this campaign included vegetation sampling (VWC, height, and leaf count), SM, and SR. To measure VWC, crop density was recorded and three plants per subsite were collected in both corn and soybean fields.
Plants were separated into leaves, stems, and pods/ears, and each part was weighed before and after drying in an oven at 60°C for 5-7 days to estimate water content and dry matter. VWC was then calculated using Equation 1:

$VWC = (W_f - W_d) \times \rho_{plant}$    (1)

Here, $W_f$ and $W_d$ represent the average fresh and dry weights of the three samples in kilograms, respectively, while $\rho_{plant}$ is the average number of plants per square meter. In corn fields, plant VWC distribution typically comprises 50% in the stem, 20% in leaves, and 30% in the ear, while in soybeans, it is either 60% in the stem, 20% in leaves, and 20% in pods, or 70% in the stem, 20% in leaves, and 10% in pods. This detailed breakdown aids in assessing the penetration levels of C-band and L-band SAR and optical sensors for VWC and crop height estimation.

SM was measured using reflectometry probes, calibrated against gravimetric methods for high accuracy. SR was measured using lidar, pinboard, and gridboard techniques (Figure 4.2a), with pinboard measurements selected for this study for consistency with the roughness measurements in the Michigan 2023 campaign (Sec. 4.2.3.b). The pinboard technique, applied in 21 fields, involved capturing soil profiles through photography using a 1-meter transect with pins set 5 mm apart (Figure 4.2a).

Figure 4.2: Photos of pinboard sampling: a) SMAPVEX16, taken from Walker et al. (2023), and b) Michigan-2023.

Figure 4.3 displays the revisit times for the SMAPVEX16 measurements. VWC measurements align with four Sentinel-1A revisit times (DOYs 153, 201, 213, and 225). SM measurements correspond to DOYs 153 and 225, and SR was measured at the beginning of the season on DOY 153. Sentinel-2 data align with three Sentinel-1A revisit times (DOYs 153, 213, and 225) and were interpolated for DOY 201.

Figure 4.3: Temporal distribution of Sentinel-1A, Sentinel-2, VWC, SM, and SR measurements in the SMAPVEX16 campaign.
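Equation 1 reduces to a short calculation, sketched below for illustration. The function name and the sample weights are hypothetical; with weights in kilograms and density in plants per square meter, the result is in kg/m².

```python
def vwc_kg_per_m2(fresh_kg, dry_kg, plants_per_m2):
    """Equation 1: VWC = (mean fresh weight - mean dry weight) * plant density."""
    if len(fresh_kg) != len(dry_kg) or not fresh_kg:
        raise ValueError("need matching, non-empty fresh and dry weight lists")
    w_f = sum(fresh_kg) / len(fresh_kg)   # average fresh weight per plant (kg)
    w_d = sum(dry_kg) / len(dry_kg)       # average dry weight per plant (kg)
    return (w_f - w_d) * plants_per_m2

# Hypothetical weights for three sampled plants and a density of 8 plants per m^2.
vwc = vwc_kg_per_m2([0.50, 0.60, 0.70], [0.10, 0.12, 0.14], 8.0)
```

Here the mean water mass per plant is 0.48 kg, so the sketch returns about 3.84 kg/m².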
The VWC and SM measurements have been averaged across each 50-meter grid area of Sentinel-1A. Due to the lack of repeated measurements at the same locations on this site, the training dataset from this site includes vegetation measurements from only 1-2 time-steps in the growing season.

4.2.3.2. Soybeans and Corn in Michigan 2022-2023

Michigan 2022

In 2022, we conducted a field campaign in Michigan, encompassing three corn fields (DOY planting: 132 and 136; DOY harvest: 293 and 311) and three soybean fields (DOY planting: 130; DOY harvest: 276 and 278). Four fields were located within the Kellogg Biological Station, characterized by non-irrigated and well-plowed conditions (Figure 4.4), while the remaining two fields were situated on privately owned farmland, featuring irrigation and crop residue from the previous year's corn plantation (Figure 4.5). A comparison of Figures 4.4 and 4.5 highlights the contrasting tillage status between the non-irrigated fields at the Kellogg Biological Station and the privately owned irrigated farms. Regular measurements were taken every 12 days from June 21st to October 7th to align with the revisit time of the Sentinel-1A satellite. These measurements included crop height, leaf count, and crop density (number of plants per square meter).

For VWC measurements, we employed a systematic sampling approach. We randomly selected three crop samples from the center of each Sentinel-1 grid pixel, with the number of pixels ranging from 2 to 10 (as illustrated in Figures 4.4 and 4.5). These samples were carefully collected and sealed in plastic bags to preserve their moisture content. The samples were then dried in an oven at 60 degrees Celsius for 7 days. VWC for both soybean and corn was calculated using Equation 1, which incorporates the wet and dry weights of the samples.

Figure 4.4: Rainfed corn and soybean fields at the Kellogg Biological Station, Michigan 2022 field campaign.
Figure 4.5: Irrigated corn and soybean fields on private farms in Michigan, 2022. This image captures the varied impact of high crop residue in a soybean field.

Figure 4.6 illustrates the temporal distribution of Sentinel-1A, Sentinel-2, and VWC measurements throughout the growing cycle for all corn and soybean fields in this campaign. Despite the diverse SR conditions encountered, SR and SM measurements were not conducted for this campaign. Furthermore, given the limited number of irrigated fields available, we leveraged the irrigated fields from this campaign to evaluate our model's performance and assess its ability to adapt to new conditions.

Figure 4.6: Temporal distribution of Sentinel-1A, Sentinel-2, and VWC measurements throughout the growing cycle of corn and soybean fields in the Michigan 2022 campaign.

Michigan 2023

In 2023, the Michigan 2022 campaign's measurements were replicated across six rainfed fields at the Kellogg Biological Station, encompassing three corn fields (planting: 125, 126, and 132; harvest: 277, 283, 300, and 319) and three soybean fields (planting: 102, 103, and 138; harvest: 283 and 276). This campaign added probe-based SM measurements and SR measurements using a pinboard (Figure 4.2b). All the fields were rainfed; among them, two fields each of corn and soybean were well-plowed, while one soybean field contained crop residue (Figure 4.7-S2) and one corn field contained a cover crop (Figure 4.7-C1). The measurements were conducted every 12 days, following the Sentinel-1 revisit time, at the center of one or two pixel areas per field.

For SR measurements, we employed a specialized pinboard instrument (Figure 4.2b). This device consists of a 4-foot (1.2192 m) frame equipped with 62 precisely arranged pins. Each pin measures 3 mm in diameter and is positioned at 20 mm intervals along the frame.
The design allows the pins to conform to the soil's topography when the board is placed on the ground, providing an accurate representation of the surface roughness. In each field, we captured six photographs, covering 3-meter transects in both South-North and West-East directions of cultivation. We carefully avoided potential obstacles such as dense crop residue, young plants, or footprints. The 3-meter length was chosen to achieve a precision of ±10% when measuring the root mean square (RMS) height and correlation length (Oh and Kay, 1998).

Figure 4.7: Exploring soil surface conditions in Michigan's 2023 corn (C) and soybean (S) fields: This image captures the varied impact of high crop residue levels in S2 (soybean) and cover crops in C1 (corn) plots.

Figure 4.8 illustrates the time distribution of Sentinel-1A, Sentinel-2, VWC, SM, and SR measurements throughout the growth cycle for corn and soybean fields in this campaign. The Sentinel-2 observations have been linearly interpolated to match the Sentinel-1A revisit times for unified dataset construction.

Figure 4.8: Temporal distribution of Sentinel-1A, Sentinel-2, VWC, SM, and SR measurements throughout the growing cycle of corn and soybean fields in the Michigan 2023 campaign.

4.2.3.3. Florida 2018

The 2018 field campaign in Florida, USA, was centered in Citra (29.4100° N, 82.1790° W) at the Plant Science Research and Education Unit (PSREU) of the University of Florida and the Institute of Food and Agricultural Sciences (UF/IFAS) (Figure 4.1). This campaign focused on a single sweet corn field planted in sandy soil, with a crop density of 8 plants per square meter. The crop, intended for human consumption, was harvested after 66 days, in mid-June. The region's climate is classified as humid subtropical, and the spring growing season of 2018 was marked by high temperatures, intense rainfall, and frequent thunderstorms. Sweet corn was sown on April 13 and harvested on June 18.
Throughout the growing season, center-pivot irrigation was employed as required. This irrigation was typically carried out late in the evening, especially during the early season's dry spells, to ensure adequate water supply (Khabbazan et al., 2022). Throughout the entire growing season, vegetation sampling was carried out every 2-3 days before dawn to assess VWC. During each sampling session, 8 plants representative of the field conditions were selected, and VWC was measured with the same method detailed for SMAPVEX16 and Michigan 2022.

A key component of this campaign was the monitoring of a corn field using a truck-mounted, fully polarimetric, L-band radar. The high temporal resolution L-band backscatter data were collected using the University of Florida L-band Automated Radar System (UF-LARS). The scatterometer systematically surveyed the corn field at a 40-degree incidence angle, capturing 16 observations spread across the day during the late season. Besides L-band scatterometer data, the field campaign setup enabled continuous measurement of SM at 15-minute intervals over a period of 58 days. For the purpose of this study, the VWC, SM, and L-band data were aggregated to match the Sentinel-1A revisit time.

Figure 4.9 illustrates the measurement revisit times in this field campaign compared to Sentinel-1A and Sentinel-2 observations. One Sentinel-1A image, on DOY 152, is missing; overall, 4 Sentinel-1A images cover the whole growing season. We used linear interpolation to calculate the Sentinel-2 observations for the same revisit times as Sentinel-1A. For detailed information on the sensor and measurement methodologies, readers are referred to Vermunt et al. (2021).

Figure 4.9: Temporal distribution of UF-LARS, Sentinel-1A, Sentinel-2, VWC, and SM in Florida.
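The linear interpolation used to align Sentinel-2 indices with Sentinel-1A revisit dates can be sketched as follows. The function name is hypothetical and the NDVI values are invented for illustration; the days of year are those of the SMAPVEX16 case, where DOY 201 is filled from the cloud-free acquisitions on DOYs 153, 213, and 225. Returning None outside the observed range, rather than extrapolating, is an assumption of this sketch.

```python
def interpolate_to_doys(obs_doys, obs_values, target_doys):
    """Linearly interpolate an index time series onto target days of year.

    obs_doys must be strictly ascending. Targets outside the observed range
    yield None instead of an extrapolated value.
    """
    points = list(zip(obs_doys, obs_values))
    result = []
    for t in target_doys:
        value = None
        # Find the pair of consecutive observations that brackets the target.
        for (d0, v0), (d1, v1) in zip(points, points[1:]):
            if d0 <= t <= d1:
                value = v0 + (v1 - v0) * (t - d0) / (d1 - d0)
                break
        result.append(value)
    return result

# Hypothetical NDVI values on the cloud-free Sentinel-2 dates.
ndvi_on_s1_dates = interpolate_to_doys(
    [153, 213, 225], [0.20, 0.80, 0.85], [153, 201, 213, 225]
)
```

The same routine would be applied per pixel and per index (NDVI, NDWI, red-edge) to build a time series on the Sentinel-1A dates.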
Drawing on the SAR, optical, and reference datasets detailed in Table 4.1 from all four field campaigns, our study design and data analysis strategy are structured as follows:

Irrigation Status: Due to the limited number of irrigated fields compared to rainfed ones, we used the irrigated fields for model generalization testing. Our ML models and FMs were primarily trained on data from rainfed soybean and corn fields.

SAR Band Availability: L-band data were only available for the Florida campaign. Consequently, we trained our models using C-band Sentinel-1A data. The Florida L-band data were utilized in the generalization section to discuss the impact of different SAR frequencies on VWC and crop height estimation.

SM and SR: The availability of these measurements varied across campaigns. To account for this, we conducted two types of analyses: a) a feature importance analysis using the Iowa 2016 and Michigan 2023 data to investigate the impact of SM and SR on VWC and crop height estimation, and b) a comprehensive analysis using data from the Iowa 2016 and Michigan (2022-2023) campaigns, excluding SM and SR, to estimate VWC and crop height.

This stratified approach allows us to maximize the utility of our diverse dataset, addressing the challenges posed by varying measurement availability across different campaigns and locations.

Table 4.1: Overview of the case studies measurements.
Campaign Crops C-band L-band In situ SM SR* VWC Height Iowa 2016 Florida 2018 corn-soybean corn Michigan 2022 corn-soybean Michigan 2023 corn-soybean *Surface roughness (SR) ✓ ✓ ✓ ✓ ✗ ✓ ✗ ✗ ✓ ✓ ✓ ✓ ✓ ✗ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✗ ✓ ✓ Irrigation status Rainfed Irrigated Rainfed and irrigated Rainfed

4.2.4. Surface roughness computation

For the Michigan 2023 campaign, we employed the pinboard method to measure SR. Six photographs were taken per field, both across-row and in-row, at the center of each Sentinel-1 pixel.
These images were processed using Fiji ImageJ software to calibrate dimensions and trace the soil surface at 5 mm intervals, creating a topographic profile. To minimize human-induced errors in pinboard data analysis, we established specific guidelines, such as marking the center of each pin and excluding broken or stuck pins. The roughness profile reveals variations in soil heights, quantified using two key parameters: i) the standard deviation of surface height ($\sigma_s$), also known as random roughness or RMS height, and ii) the correlation length ($l_c$). These parameters are calculated after adjusting for the mean slope of the surface, using the following formulas (Walker et al., 2023):

$\sigma_s = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N}\left(z(j) - \frac{1}{N}\sum_{i=1}^{N} z(i)\right)^2}$    (2)

$\rho(n) = \frac{\frac{1}{(N-n)-1}\sum_{j=1}^{N-n} z(j)\,z(j+n)}{\frac{1}{N-1}\sum_{j=1}^{N} z(j)^2}$    (3)

where $N$ is the total number of points on the surface, $z(j)$ is the height of the $j$th point on the soil surface, and $l_c$ is the lag, $n$, at which the autocorrelation function $\rho(n)$ drops below $e^{-1}$.

For input into our ML and DL models, we calculated the in-row and across-row roughness parameter SR using equations 4-6 from Lawrence et al. (2013):

$Z_s = \sigma_s^2 / l_c$    (4)

$SR = \begin{cases} 2.62\,(1 - e^{-Z_s/2.993}) & (Z_s < 1.553\ \text{cm}) \\ 0.853 & (Z_s > 1.553\ \text{cm}) \end{cases}$    (5)

For training the models, we utilized the in-row and across-row roughness parameter h, calculated for the Michigan 2023 dataset and derived from pinboard measurements in the SMAPVEX-16 campaign. These parameters, as documented by Walker et al. (2023) for SMAPVEX-16, are detailed in Tables C1 and C2 (appendix).

4.3. Methodology

In this study, our primary objective is to estimate the VWC and crop height for non-irrigated soybean and corn fields across three different growth seasons and two geographical locations with distinct climates.
While the main purpose of this study is VWC estimation, we incorporated crop height estimation to facilitate MTL and to evaluate its impact on enhancing VWC estimation. Given the limited size of our labeled dataset, we developed an FM and assessed its performance against traditional shallow ML methods, RF and XGB. Details on the FM and ML methods are provided below. To further test our model, we used labeled datasets from irrigated fields in Michigan and Florida to assess generalization. The FM, including all pre-training and fine-tuning procedures, and the ML methods are fully documented and accessible for reproducibility in our publicly available repository at https://github.com/MahyaSad/FoundationModel-CropBiophysicalParameters.

4.3.1. Foundation models

Background

FMs with SSL have shown great promise for crop mapping and classification purposes using RS imagery, particularly when labeled training data is limited. These approaches leverage spatial, temporal, and multi-modal information in satellite image time series (SITS) data to learn rich, transferable feature representations that enhance supervised learning performance using small labeled datasets. Contrastive learning and masked autoencoding (MAE) are widely used SSL strategies in RS. Contrastive learning constructs positive pairs from images of the same location at different times to learn similar representations while distinguishing them from negative pairs (Manas et al., 2021). While contrastive learning has been used with RS data for different downstream tasks (Chen et al., 2020; Fuller et al., 2023; He et al., 2020; Scheibenreif et al., 2022; Xu et al., 2024), it is hard to apply to SITS because designing augmentations for temporal sequences is difficult (Tsaris et al., 2024). MAE randomly masks input patches and learns to reconstruct the missing pixels, enabling the learning of useful representations from unlabeled data without requiring positive/negative pairs.
This is advantageous for crop mapping and monitoring using SAR, as self-supervision using MAE can capture unique backscatter signatures. Several studies have used self-supervision using MAE for various downstream tasks, such as land cover classification, building segmentation, and scene classification (Bountos et al., 2023; Cha et al., 2023; Cong et al., 2022; Fuller et al., 2023; Sun et al., 2022; Tsaris et al., 2024; D. Wang et al., 2022). These studies employed ViT architectures of varying sizes, from 87 million to 3 billion parameters, and introduced techniques such as rotated varied-size attention (RVSA) and novel positional encodings to enhance robustness and generalization. The scaling of ViTs to billion-parameter models highlights the potential of large-scale FMs for RS applications. The models are pretrained on large-scale datasets, such as SSL4EO, MillionAID, and USatlas, containing millions of radar, optical, and multi-sensor samples.

Self-Supervised Learning Using Geospatial Foundation Models

Our proposed methodology begins with an SSL phase, employing a geospatial FM based on the ViT architecture. This model is specifically designed to learn rich spatial representations from unlabeled satellite imagery via an MAE approach. The initial step involves data preparation, where Sentinel-1A images from three growing seasons in Iowa (2016) and Michigan (2022-2023) are segmented into fixed-size patches. These patches, which include three features (VV, VH, and incidence angle), are linearly embedded before being input into the model. A similar process is applied to optical images from Sentinel-2, which incorporate the three main indices: NDVI, NDWI, and red-edge. As illustrated in Figure 4.10, the SSL processes for SAR and optical data are conducted separately; consequently, there was no need to interpolate optical data to match the SAR revisit times.
The optical images used were readily available during the growing seasons of Michigan (2022-2023) and Iowa (2016), as depicted in Figures 4.3, 4.6, and 4.8.

The flowchart in Figure 4.10 illustrates a key aspect of our methodology: the ViT functions as the encoder within the MAE framework, processing solely the unmasked patches. It is structured with multiple layers of transformer blocks, each containing multi-head self-attention mechanisms and feedforward neural networks. This architecture allows the encoder to focus its computational resources on understanding the visible parts of the input, developing internal representations that infer the missing content based on the spatial context provided by the unmasked patches.

The decoder network, which mirrors the architecture of the encoder but is optimized for the task of reconstruction, attempts to regenerate the full input image from the encoded latent representations. The goal of the decoder is to predict the appearance of the masked patches, thereby learning to fill in missing information effectively. The primary objective during pre-training is to minimize the reconstruction error between the original full input images and the outputs predicted by the decoder. This process encourages the development of an internal representation that captures essential spatial features within the data, which is crucial for understanding and interpreting satellite imagery.

Our model has approximately 15 million parameters, which is considered a moderate-scale model in the context of DL. The training dataset covers three growing seasons and comprises a total of 4000 patches of size 256 with three channels for each of the SAR and optical training processes. The hyperparameters of this model, detailed in Table 4.2, were optimized using training and validation datasets that included approximately 262 million pixels at 50-meter resolution (4000 × 256 × 256) for each of the three channels, separately for both optical and SAR data.
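The masking step at the heart of the MAE can be illustrated with a short sketch. It is not the study's implementation: the function names and the count of 64 patches per image are hypothetical, and the error is computed over all patches to follow the description above (many MAE implementations restrict it to the masked patches). The mask ratio of 0.136 is the value listed in Table 4.2.

```python
import random

def random_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into visible and masked sets, MAE style.

    Returns (visible, masked) as sorted index lists; only the visible
    patches would be passed to the encoder.
    """
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    shuffled = list(range(num_patches))
    rng.shuffle(shuffled)
    masked = sorted(shuffled[:num_masked])
    visible = sorted(shuffled[num_masked:])
    return visible, masked

def reconstruction_mse(original, reconstructed):
    """Mean squared error between two equally shaped lists of flattened patches."""
    total, count = 0.0, 0
    for patch, recon in zip(original, reconstructed):
        for a, b in zip(patch, recon):
            total += (a - b) ** 2
            count += 1
    return total / count

# Hypothetical image split into 64 patches, masked at the Table 4.2 ratio.
visible, masked = random_mask(64, 0.136, seed=0)
```

With 64 patches and a ratio of 0.136, nine patches are hidden and the encoder sees the remaining 55.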
Table 4.2: Hyperparameter Settings for the MAE in the SSL phase.

| Hyperparameter | Value | Key Justification |
|---|---|---|
| Patch Size | 256 | Captures fine-grained details in high-res imagery |
| Encoder Dim | 768 | Balances model capacity and computational efficiency |
| Encoder Layers | 12 | Enables learning of hierarchical features |
| Attention Heads | 16 | Allows focus on diverse data aspects simultaneously |
| Decoder Dim | 512 | Ensures high-quality reconstruction of masked patches |
| Decoder Layers | 1 | Adequate for reconstruction while maintaining efficiency |
| Learning Rate | 6.777e-05 | Optimized for convergence speed and model stability |
| Batch Size | 16 | Balances GPU memory constraints and generalization |
| Mask Ratio | 0.136 | Encourages learning of robust features across image regions |
| Epochs | 50 | Sufficient for model convergence |

The pre-training phase results in a set of encoder weights that encapsulate the spatial characteristics learned from the SAR and optical datasets. These weights form a sophisticated spatial understanding that can be leveraged in subsequent supervised tasks. According to the flowchart in Figure 4.10, the pre-trained encoder weights from each SAR and optical training session were fine-tuned within the supervised learning architecture.

Spatio-Temporal Supervised Single Task and Multi Task with Fine-Tuning Geospatial FMs

Our methodology extends the use of the pre-trained ViT from the MAE setup to supervised MTL and STL frameworks for estimating VWC and crop height. This approach leverages rich spatial features learned during the MAE pre-training phase and captures temporal dynamics through a 1D Convolutional Neural Network layer (1D-Conv). The details of the model are illustrated in Figure 4.10 and are as follows: Two pre-trained ViT models are initialized: one for SAR data and another for optical data. These models are loaded with pre-trained weights, excluding the decoder parts, allowing us to utilize the rich spatial features captured during the MAE pre-training on large unlabeled SAR and optical datasets.
These pre-trained ViT models form the backbone of our feature extraction process. The radar and optical images are processed independently through their respective ViT encoders. The outputs from these encoders are then fused by concatenating the features from both encoders, a technique known as mid-level fusion. Among the different fusion techniques (early, mid, and late/decision fusion), mid-fusion and late-fusion have shown better classification accuracy in studies using DL with SAR observations for crop classification (Saadat et al., 2022; Yuan et al., 2023). However, mid-fusion offers a pragmatic balance between accuracy and computational efficiency (Garnot et al., 2022). Given these advantages, we opted for mid-fusion using feature concatenation in our study. We also used concatenation to merge the additional features with the fused encoder features, so that VH/VV ratios, RVI, climate variables, SR measures, SM levels, and polarimetric decomposition parameters are passed together into a 1D-Conv layer. In detail, the 1D-Conv layer is implemented to analyze temporal sequences by sliding convolutional filters over the time dimension of the fused features. This method effectively extracts temporal features, critical for understanding changes in VWC and crop height over the growing season. The layer configuration, including the number of filters and kernel size, has been optimized through extensive hyperparameter tuning to maximize the extraction of useful temporal information. The extracted features are then fed into two linear projection layers, one for VWC and another for crop height, enabling the model to perform MTL efficiently. In scenarios where STL is applied, only the relevant linear projection layer is used. The entire model, comprising the pre-trained encoder, 1D-Conv, and linear projection, is trained end-to-end, optimizing all parameters for maximal performance on the regression tasks.
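The data flow described above (mid-level fusion by concatenation, a 1D convolution over time, and two linear heads) can be sketched in NumPy as follows. This is a shape-level illustration under stated assumptions, not the dissertation's PyTorch model: the encoder outputs are replaced by random arrays, and the feature widths, kernel size, ReLU activation, per-time-step outputs, and all names (`conv1d_time`, `forward`, `params`) are hypothetical.

```python
import numpy as np

def conv1d_time(x, kernels, bias):
    """'Same'-padded 1D convolution over the time axis.

    x: (T, F) sequence, kernels: (K, F, C) filters, bias: (C,). Returns (T, C).
    """
    k = kernels.shape[0]
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = np.pad(x, ((pad_left, pad_right), (0, 0)))
    out = np.empty((x.shape[0], kernels.shape[2]))
    for t in range(x.shape[0]):
        window = padded[t:t + k]  # (K, F) slice centred on time step t
        out[t] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return out

def forward(sar_feats, opt_feats, aux_feats, params):
    """Mid-level fusion -> temporal convolution -> one linear head per task."""
    fused = np.concatenate([sar_feats, opt_feats, aux_feats], axis=1)
    hidden = np.maximum(conv1d_time(fused, params["kernels"], params["bias"]), 0.0)  # ReLU is an assumption
    vwc = hidden @ params["w_vwc"] + params["b_vwc"]
    height = hidden @ params["w_height"] + params["b_height"]
    return vwc, height

rng = np.random.default_rng(0)
T, C, K = 10, 32, 3                      # time steps, filters, kernel size (illustrative)
sar_feats = rng.normal(size=(T, 16))     # stand-in for SAR encoder output
opt_feats = rng.normal(size=(T, 16))     # stand-in for optical encoder output
aux_feats = rng.normal(size=(T, 5))      # stand-in for VH/VV, RVI, climate, ...
F = sar_feats.shape[1] + opt_feats.shape[1] + aux_feats.shape[1]
params = {
    "kernels": rng.normal(scale=0.1, size=(K, F, C)),
    "bias": np.zeros(C),
    "w_vwc": rng.normal(size=C), "b_vwc": 0.0,
    "w_height": rng.normal(size=C), "b_height": 0.0,
}
vwc, height = forward(sar_feats, opt_feats, aux_feats, params)
```

For STL, only one of the two heads would be evaluated; the fused and convolved features are shared in both cases.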
We employ the Adam optimizer with a Cosine Annealing Learning Rate scheduler and apply early stopping based on validation loss to prevent overfitting.

Figure 4.10: Comprehensive Workflow of the MTL-FM: Illustrating the Pre-training and Fine-tuning Phases for VWC and Crop Height Estimation.

Training/Validation and Test Composition

In this study, we analyzed non-irrigated soybean and corn data collected over multiple years (2016, 2022-2023) and locations (Michigan and Iowa) for a supervised learning task. Specifically, our soybean dataset includes data from 6 pixels across three different farms in 2023, each recorded at 10 time-steps throughout the growing season; data from 6 pixels across two different farms in 2022, each also captured at 10 time-steps; and data from 50 pixels in 2016, each with a single time-step that varies throughout the season, covering early (May) and mid-season (August). This results in a total of 62 pixels and 192 time-steps for the soybean dataset. For corn, the dataset comprises data from 5 pixels across three different fields in 2023 with 10 time-steps each; data from 6 pixels in 2022 across two different fields, also with 10 time-steps each; and data from 119 pixels in 2016, each with a single time-step during early and mid-season. We implemented a pixel-wise data splitting strategy for the training, validation, and testing phases. Initially, 14 pixels were reserved for the soybean test set, comprising two pixels from each of 2022 and 2023 along with 10 pixels from 2016, and 18 pixels were reserved for the corn test set, comprising one pixel from 2023, two pixels from 2022, and 18 pixels from 2016. For hyperparameter tuning and early stopping, we employed 10-fold cross-validation on the remaining pixels.
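A pixel-wise split of this kind can be sketched as below. The sketch assumes that folds are formed over pixel identifiers (so that all time-steps of a pixel stay in the same set); the text does not spell out the fold construction, and the function name `pixelwise_split`, the toy pixel counts, and the 5-fold setting are illustrative only.

```python
import numpy as np

def pixelwise_split(pixel_ids, test_pixels, n_folds, seed=0):
    """Yield (train_idx, val_idx, test_idx) with no pixel shared between sets."""
    pixel_ids = np.asarray(pixel_ids)
    test_mask = np.isin(pixel_ids, list(test_pixels))
    test_idx = np.flatnonzero(test_mask)
    remaining = np.unique(pixel_ids[~test_mask])
    shuffled = np.random.default_rng(seed).permutation(remaining)
    for fold in np.array_split(shuffled, n_folds):
        val_mask = np.isin(pixel_ids, fold) & ~test_mask
        train_mask = ~val_mask & ~test_mask
        yield np.flatnonzero(train_mask), np.flatnonzero(val_mask), test_idx

# Toy example: 12 pixels, each observed at 10 time-steps.
pixel_ids = np.repeat(np.arange(12), 10)
folds = list(pixelwise_split(pixel_ids, test_pixels={0, 1}, n_folds=5))
```

Each element of `folds` holds sample indices, so the held-out test pixels never appear in any training or validation fold.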
This structured approach ensures an even distribution of data, robust model evaluation, and optimal parameter selection while maintaining a distinct separation between the test set and other data sets, thus providing an unbiased assessment of the model's generalization capabilities. To accommodate the ViT model, which requires a fixed patch size, we replicated the data to extend each pixel to an 8 × 8 patch. The optimized key hyperparameters of the supervised learning model are shown in Table 4.3. The objective function is designed to minimize the mean squared error loss, which includes the summation of losses for both VWC and crop height predictions. The model is implemented using PyTorch.

Table 4.3: Optimized Hyperparameters for Supervised Learning Models.

| Hyperparameter | Value |
|---|---|
| lr | 0.00024 |
| Patch Size | 8 |
| Batch Size | 8 |
| Patience (early stopping) | 13 |
| 1D-Conv layer | 1 |
| 1D-Conv size | 1024 |
| Dropout Rate | 0.136 |
| Weight Decay | 2.95e-05 |
| lr_cosine_init | 0.00999 |
| lr_cosine_cycles | 4 |

lr_cosine_init: Cosine Annealing Initial LR; lr_cosine_cycles: Cosine Annealing Cycles.

4.3.2. Machine Learning methods

To evaluate the FM's performance, we selected two shallow ML methods, Random Forest (RF) and XGBoost (XGB), which are commonly used for crop classification, crop monitoring, and yield prediction purposes. These ML methods were implemented using the open-source Python Scikit-learn package. To ensure a fair and robust comparison, we employed identical training and test datasets for both the traditional ML methods and the FMs. Furthermore, we implemented a 10-fold cross-validation strategy across all models. To integrate SAR, optical, and auxiliary features, we implemented an early fusion strategy. This approach involved concatenating all feature arrays into a unified input matrix prior to model ingestion. Notably, these traditional ML methods are limited in capturing temporal dependencies compared to temporal DL models.
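As a small illustration of the early fusion step, the sketch below stacks per-sample feature blocks column-wise into one design matrix before any model sees them. The function name `early_fusion` and the array sizes are assumptions for illustration; the actual feature layout used in the study is not specified at this level of detail.

```python
import numpy as np

def early_fusion(sar, optical, auxiliary):
    """Stack per-sample feature arrays column-wise into one input matrix."""
    blocks = [np.asarray(b, dtype=float) for b in (sar, optical, auxiliary)]
    n = blocks[0].shape[0]
    if any(b.ndim != 2 or b.shape[0] != n for b in blocks):
        raise ValueError("every block must be 2-D with the same number of samples")
    return np.hstack(blocks)

rng = np.random.default_rng(0)
sar = rng.normal(size=(6, 4))        # e.g. VH, VV, VH/VV, RVI (illustrative)
optical = rng.normal(size=(6, 3))    # e.g. red-edge, NDVI, NDWI (illustrative)
auxiliary = rng.normal(size=(6, 2))  # e.g. precipitation, temperature (illustrative)
X = early_fusion(sar, optical, auxiliary)
```

The resulting matrix `X` has one row per sample and could be passed to any tabular regressor; in contrast to the mid-level fusion used for the FMs, no sensor-specific encoding happens before the concatenation.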
Furthermore, while our FMs could perform both STL and MTL, we were only able to implement STL with the ML methods due to their inherent architectural constraints.

4.3.2.1. Random Forest Regression

RF is a robust ensemble learning method widely used in classification and regression problems (Breiman, 2001). Ensemble learning involves producing multiple models and combining them to solve a particular task, with common types being boosting and bagging. RF, a successful bagging approach, consists of numerous individual decision trees, each making its own prediction. The model combines all predictions to enhance performance (Liaw and Wiener, 2002). RF is particularly advantageous for crop biophysical parameter estimation due to its ability to handle high-dimensional data and capture non-linear relationships between features (Belgiu and Drăguţ, 2016). Additionally, RF provides feature importance rankings, which are valuable for understanding the relative impact of different SAR and optical indices on crop biophysical parameters.

4.3.2.2. Gradient Boosting and Extreme Gradient Boosting

XGBoost (XGB), a popular implementation of Gradient Boosting, enhances model performance, speed, flexibility, and efficiency (Chen and Guestrin, 2016). XGB has gained widespread adoption in ML competitions and practical applications due to its superior performance and scalability. It builds upon the principles of gradient boosting introduced by Friedman (2001), but incorporates several improvements. These include a more regularized model formalization to control overfitting, which was a common issue in gradient boosting machines (GBMs). XGB features advanced tree optimization techniques, built-in cross-validation, and efficient handling of missing values, all of which contribute to its robustness and performance (Chen and Guestrin, 2016). In contrast to RF, XGB builds trees sequentially, with each new tree correcting the errors of the previous ones.
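The sequential idea behind boosting can be made concrete with a toy example. The sketch below is plain gradient boosting for squared loss with one-split regression stumps on a single feature; it is not XGBoost (no regularization term, no second-order information, no missing-value handling), and the function names, synthetic data, and settings are assumptions for illustration.

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-threshold split of a 1-D feature under squared error."""
    best = None
    for threshold in np.unique(x)[:-1]:
        left = residual[x <= threshold]
        right = residual[x > threshold]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    return best[1:]

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Each new stump is fitted to the residuals left by the previous ones."""
    base = float(y.mean())
    prediction = np.full(y.shape, base, dtype=float)
    stumps, history = [], []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - prediction)
        prediction = prediction + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
        history.append(float(np.mean((y - prediction) ** 2)))
    return base, stumps, history

def predict(base, stumps, x, learning_rate=0.1):
    out = np.full(x.shape, base, dtype=float)
    for stump in stumps:
        out = out + learning_rate * predict_stump(stump, x)
    return out

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)
base, stumps, history = boost(x, y)
```

The list `history` records the training mean squared error after each round; a bagging method such as RF would instead fit its trees independently on resampled data and average them.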
Traditional ML methods cannot capture the temporal dependencies in the SAR and optical data as effectively as temporal DL models and more advanced time series analysis techniques (Ienco et al., 2017). In addition, ML methods are intrinsically designed to optimize for a single output variable at a time (Ruder, 2017); therefore, we performed MTL only with FMs. This limitation restricts their ability to simultaneously predict multiple related tasks, such as VWC and crop height, in a single model framework. Despite these limitations, we chose to include both RF and XGB in our study to provide a comprehensive comparison with the FMs and to leverage their respective strengths in handling different aspects of SAR and optical datasets.

4.4. Results

4.4.1. Self-Supervised Learning (SSL) Performance

The SSL performance was evaluated for both SAR and optical datasets over selected areas in Michigan and Iowa, as shown in Figure 4.1. Due to computational resource limitations, the training area was confined to the region surrounding the referenced dataset. Future research could explore training with larger areas, such as entire states or continental regions. Figure 4.11 illustrates the reconstruction results for the Sentinel-1A VH channel and Sentinel-2 NDVI images at different stages of the SSL process. For both SAR and optical images, we observe a clear progression in the quality of reconstruction as the training progresses. At the 10th epoch (Column c), the reconstructed images show a rough approximation of the masked areas. By the 30th epoch (Column d), the reconstruction quality improves significantly, with more detailed features becoming apparent. At the 50th epoch (Column e), the reconstructed areas closely resemble the original images, demonstrating the model's ability to learn and reproduce complex spatial patterns in both SAR and optical data.
Figure 4.12 presents the loss progression for the SSL models on both Sentinel-1A and Sentinel-2 datasets.

Figure 4.11: Reconstructed examples on Sentinel-1A VH channel and Sentinel-2 NDVI images. (a) denotes masked images; (b) denotes reconstructed results of pre-trained model at the 10th epoch; (c) denotes reconstructed results of pre-trained model at the 30th epoch; (d) denotes reconstructed results of pre-trained model at the 50th epoch; (e) denotes the original images.

Figure 4.12: Loss progression for the SSL models on Sentinel-1A and Sentinel-2 datasets.

4.4.2. SAR and Optical Feature Relationships with VWC and Crop Height

The relationships between VWC, crop height, and SAR and optical features are illustrated in Figures 4.13 and C1 (appendix) for soybean, and Figures 4.14 and C2 (appendix) for corn. SAR features, including VV and VH backscatter, show similar positive correlations with both VWC and height across various field conditions, though these relationships are not consistently strong. RVI shows a weaker relationship with VWC and height than VH and VV do. Among optical features, red-edge reflectance and NDVI demonstrate strong positive correlations with both VWC and crop height, exhibiting clear and consistent relationships across most fields. Notably, NDWI displays a pronounced negative correlation with both parameters. The definition of NDWI as $(R_{Green} - R_{NIR})/(R_{Green} + R_{NIR})$ was more effective in both VWC and height estimation; therefore, from here on, NDWI refers to the index derived from this definition. The SM shows variable relationships across different fields, highlighting the influence of local conditions, management practices, and precipitation patterns.

Figure 4.13: Relationship between VWC and various SAR and optical features for soybean fields. The features include SAR-derived (VH, VV, RVI), optical (Red-edge, NDVI, NDWI), and SM parameters.
Panels (a-c): Three different non-irrigated soybean fields in Michigan, 2023; (d-e) Uphill and downhill sections of a non-irrigated soybean field in Michigan, 2022; (f) Another non-irrigated soybean field in Michigan, 2022; and (g) An irrigated soybean field in Michigan, 2022 with no tillage and high crop residue, used for model generalization testing. NDWI is calculated using $(R_{Green} - R_{NIR})/(R_{Green} + R_{NIR})$. In each panel, the horizontal axis shows the DoY, the left vertical axis shows the SAR/optical features, and the right vertical axis shows VWC (kg/m²).

Figure 4.14: Relationship between VWC and SAR and optical features for corn crops. Features include C-band SAR (VH, VV, RVI), L-band SAR (HH, VV, RVI, cross-pol), optical (Red-edge, NDVI, NDWI), and soil moisture (SM) parameters. Panels show corn fields from Michigan (a-h) and Florida (i) under various conditions: (a-b) uphill/downhill sections with cover crop effects; (c-e) different SM and VWC patterns; (f-g) varying stress conditions; (h) irrigated; and (i) irrigated with L-band SAR data.

Figure 4.14 illustrates the relationships between VWC, SAR, and optical features for corn crops across diverse conditions. Of particular note are panels (a) and (b), which show an apparent anomaly where the downhill section has higher SM but lower VWC, while the uphill section shows the opposite. This unusual pattern is attributed to the use of a diverse cover crop (including red clover, alfalfa, chicory, and annual ryegrass) prior to corn planting. The cover crop was terminated in early May, followed by corn planting. Subsequent dry conditions from May to late June likely favored corn growth in higher elevation areas, potentially due to residual effects of the cover crop on soil structure and moisture retention. This observation highlights the complex interactions between management practices, topography, and crop water dynamics.
Figure 4.14i compares L-band and C-band SAR features for a Florida corn field, showing similar behavior across VV, VH, and RVI. This suggests C-band Sentinel-1A may have comparable capability to L-band SAR for estimating corn VWC and height. However, panels (g) and (h) display abnormal SAR features, potentially affecting model performance. These anomalies warrant further investigation to understand their impact on our results. Figures C1 and C2 (appendix) display an important distinction between corn and soybean crops near harvest: their contrasting physical responses. Corn maintains its structural height even as it matures and dries, resulting in relatively stable height measurements. However, its VWC and consequently NDVI decrease dramatically as the plant dries out. In contrast, soybeans undergo a more pronounced physical change. As they approach harvest and lose leaves, soybean plants tend to wilt and flatten, especially under windy conditions. This flattening effect can lead to a significant reduction in the effective height that reflects energy back to satellites, even though the actual plant height has not decreased as much.

4.4.3. Feature Combination Analysis

In this section, we employed SHAP (SHapley Additive exPlanations) (Lundberg and Lee, 2017) for feature selection and analysis. SHAP, a method grounded in cooperative game theory, is a powerful tool for interpreting the predictions of both DL and ML models. By assigning SHAP values to each feature, this method quantifies the contribution of each feature to the model's output, enabling a nuanced interpretation of both the significance and direction of influence of each feature on the prediction. We employed RF for feature importance analysis due to its suitability for smaller datasets, such as our limited samples from Michigan (2023) and Iowa (2016), excluding Michigan 2022 due to missing soil moisture and surface roughness data.
RF provides a robust and interpretable method, essential for consistency given the reduced sample size. Conducting feature importance analysis with the foundation model is computationally intensive, so RF was used instead. However, in section 4.4.4, we evaluated all models across 22 feature combinations and compared their performance. SHAP summary plots for various features in estimating crop height and VWC using the RF model are presented in Figure 4.15. The plots illustrate the relative importance of optical features, SAR features, and weather parameters. Features include optical indices (Red-edge, NDVI, NDWI), SAR backscatter and indices (VH, VV, VH/VV, RVI), SAR polarimetric parameters (Entropy, Anisotropy, Alpha), incidence angle, weather data (P: rainfall, Tmin: minimum temperature, and Tmax: maximum temperature), and soil surface measurements (SM, in-row and across-row SR).

Figure 4.15: SHAP summary plots illustrating feature importance in RF model for crop height and VWC estimation (2016 and 2023). Bars represent mean absolute SHAP values; longer bars indicate greater feature importance in model predictions.

For soybean VWC estimation, optical indices (NDVI and NDWI) emerged as the most important features, with NDVI having a greater impact. Precipitation (P) ranks close behind NDWI. Among SAR backscatter parameters, VH is the most effective, ranking fourth overall, while VV shows minimal importance. This could be due to VH's sensitivity to vegetation structure and water content, because it interacts with the volume scattering properties of the vegetation, whereas VV is more affected by surface scattering, which depends on the soil SR and moisture content. SM has minimal importance, while in-row SR is more effective than across-row SR, possibly due to row orientation relative to the sensor. VH/VV is more effective than RVI and VV. Among polarimetric decomposition parameters, entropy ranks highest, but overall, they have minimal impact.
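To make the quantity behind such summary plots concrete, the sketch below computes exact Shapley values by brute-force enumeration of feature coalitions and then averages their absolute values over samples. It is an illustration under several assumptions, not the procedure used in the study: absent features are replaced by a single baseline vector, the model is a toy linear function rather than an RF, the three feature names are placeholders, and the enumeration is only feasible for a handful of features (SHAP libraries use much faster model-specific algorithms).

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                idx = np.array(subset, dtype=int)
                without = baseline.copy()
                without[idx] = x[idx]          # coalition without feature i
                with_i = without.copy()
                with_i[i] = x[i]               # same coalition plus feature i
                phi[i] += weight * (predict(with_i) - predict(without))
    return phi

def mean_abs_shap(predict, X, baseline):
    """Bar lengths of a SHAP-style importance plot: mean |phi| per feature."""
    return np.mean([np.abs(shapley_values(predict, row, baseline)) for row in X], axis=0)

rng = np.random.default_rng(0)
feature_names = ["NDVI", "VH", "P"]            # placeholder names only
X = rng.normal(size=(20, 3))
baseline = X.mean(axis=0)
w = np.array([2.0, -1.0, 0.5])
predict = lambda v: float(w @ v + 0.3)         # toy linear model, not an RF
importance = mean_abs_shap(predict, X, baseline)
ranking = [feature_names[k] for k in np.argsort(-importance)]
```

For a linear model the Shapley value of feature i reduces to `w[i] * (x[i] - baseline[i])`, which gives a convenient check of the enumeration.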
[Figure 4.15 panels: (a) Soybean VWC; (b) Soybean height; (c) Corn VWC; (d) Corn height]

In contrast, soybean height estimation was predominantly influenced by P, which far outweighed other features in importance. Following P, similar to VWC estimation, NDWI and NDVI have almost the same impact and rank second. As with VWC, VH is more effective compared to VV, VH/VV, and RVI. The impact of polarimetric decomposition parameters is almost zero. SM is more effective here compared to VWC estimation, but its impact is minimal compared to P, optical indices, and VH. Unlike VWC estimation, across-row SR is more effective here compared to in-row SR.

For corn VWC estimation, NDWI shows the highest importance, followed by P and NDVI in the second and third ranks, respectively. Unlike soybean, corn VWC estimation finds SR across rows to be more important, whereas for height estimation, in-row roughness is more significant. SM has a minimal impact on both corn VWC and height estimation. Among weather parameters, after precipitation (P), Tmax is more effective than Tmin for both VWC and height estimation. Incidence angle and red-edge indices have moderate impacts on both VWC and height estimation. Regarding SAR features, contrary to soybeans, corn VWC estimation shows VV slightly more important than VH (with a very minimal difference). For height estimation, SAR indices (VH/VV and RVI) surpass individual VH and VV polarizations in importance. Similar to soybeans, SAR polarimetric decomposition parameters are not as influential compared to other features for corn estimations.

Based on the above feature importance analysis, we selected 22 feature combinations to evaluate the performance of RF, XGB, and the FMs. This approach allowed us to systematically assess the effectiveness of different feature sets in estimating corn and soybean VWC and height. As explained in the methodology section, traditional ML methods were applied for STL, focusing on either VWC or height estimation individually. However, we extended our evaluation to include
SSL with FMs in both STL (only VWC or only height) and MTL (height and VWC simultaneously) scenarios. Our hypothesis was that MTL should improve accuracy, as each task can provide important complementary information to the other, potentially enhancing overall model performance. It is important to note that, for the SAR-only and optical-only feature combinations, the pre-trained SAR and optical encoders were each fine-tuned individually within the FM's supervised learning framework, without any integration of their features. The performance of these STL and MTL approaches across 22 feature combinations for both soybean and corn VWC estimations is visualized in Figure 4.16 and Figure C3 (appendix), and for height estimation in Figure C4 and Figure C5 in the supplementary material. The surface measurements (SM and SR) are not included in these combinations because the Michigan 2022 dataset lacks these measurements. As shown in Figure 4.15, SM and both in-row and across-row roughness are not significant for soybean VWC and height estimation compared to the other SAR, optical, and weather features. However, in-row and across-row SR rank next in importance after precipitation, NDVI, and NDWI for corn VWC and height estimation. Our analysis of the heat map figures reveals several key insights:

1) Optical features alone outperform SAR features in estimating crop biophysical parameters, even with interpolated Sentinel-2 data for missing dates. However, the addition of SAR features enhances the ultimate estimation accuracy, indicating that SAR complements optical data in crop parameter estimation.

2) Contrary to our hypothesis, MTL with FMs yields slightly lower R² and higher MAE compared to STL for VWC and height estimation. This suggests a trade-off in model performance when simultaneously optimizing for multiple parameters.
3) The comparison between FMs and ML methods shows that, while RF outperforms the FMs for some feature combinations, STL-FM outperforms RF and XGB for the best feature combinations (Table 4.4). For SAR and optical features in isolation in particular, STL-FM outperforms RF (though not in all combinations), possibly due to the SSL approach's effectiveness with the SAR and optical feature types. The STL-FM using the optimal feature combination outperforms traditional ML methods across most parameters, with the notable exception of soybean height estimation.

4) Overall, RF demonstrates superior performance compared to XGB, except for corn VWC, where the two perform equally.

5) The inclusion of polarimetric decomposition features such as entropy, anisotropy, and alpha angle did not yield significant improvements in either VWC or height estimation accuracy.

6) The incorporation of weather parameters consistently enhanced model performance across all feature combinations for both VWC and height estimation.

7) For soybean, the red-edge index did not provide substantial benefits, while NDVI and NDWI showed comparable importance. For corn, the combination of the three optical indices had the highest value.

8) Among SAR-only features, the combination of VH, VH/VV ratio, and RVI proved most effective for both soybean and corn. SAR features (VH, VH/VV, RVI) alone achieved an MAE of 0.5 and R² of 0.8 for soybean VWC estimation. For corn, combining these SAR features with climate data resulted in an MAE of 0.92 and R² of 0.83 for VWC estimation, highlighting the significant role of precipitation in corn VWC and height estimation.

9) VH polarization outperforms VV for soybeans, while both polarizations are important for corn parameter estimation.

Figure 4.16: Performance metrics for soybean VWC estimation using different feature combinations and model types. Entropy (H), Anisotropy (E); tmin and tmax are the minimum and maximum temperature. MAE and RMSE are in kg/m².

4.4.4. Best Combination Results

Table 4.4 displays the feature combinations that yielded the best performance metrics (lowest MAE and RMSE, and highest R²) for estimating VWC and height (H) of corn and soybeans, as depicted in Figures 4.16 and C3-C5 (appendix).

Table 4.4: Best feature combination results.

STL-FM MTL-FM STL-RF STL-XGB Features VH, NDVI, NDWI, climate VH, VV, Red-edge, NDVI, NDWI, climate NDVI, NDWI, climate VH, VHVV, RVI, Red-edge, NDVI, NDWI, climate Metric MAE 0.36 RMSE 0.51 R2 0.87 MAE 5.68 RMSE 7.36 R2 0.96 MAE 0.71 RMSE 1.08 R2 0.89 MAE 16.07 RMSE 35.83 R2 0.87 Task Soybean VWC Features VH, Red-edge, NDVI, NDWI, climate Soybean Height VH, NDVI, NDWI, climate Metric MAE 0.3 RMSE 0.44 R2 0.90 MAE 7.0 RMSE 8.45 R2 0.95 Features Red-edge, NDVI, NDWI, climate Red-edge, NDVI, NDWI, climate Corn VWC VH, Red-edge, NDVI, NDWI, climate MAE 0.7 RMSE 1.07 R2 0.89 VH, NDVI, NDWI, climate Metric MAE 0.36 RMSE 0.48 R2 0.88 MAE 7.64 RMSE 9.44 R2 0.93 MAE 0.78 RMSE 1.17 R2 0.87 Features VH, Red-edge, NDVI, NDWI, climate VH, Red-edge, NDVI, NDWI, climate VH, VV, Red-edge, NDVI, NDWI, climate Metric MAE 0.35 RMSE 0.47 R2 0.89 MAE 5.26 RMSE 7.44 R2 0.96 MAE 0.67 RMSE 1.15 R2 0.88 Corn Height NDVI, NDWI, climate MAE 11.40 RMSE 14.99 R2 0.98 NDVI, NDWI, climate MAE 14.09 RMSE 18.22 R2 0.97 VH, VV, VHVV, RVI, Red-edge, NDVI, NDWI, climate MAE 13.92 RMSE 22.09 R2 0.95

Figure 4.17 depicts the chronological progression of soybean growth stages from planting to harvest, based on field observations in Michigan during the 2023 growing season (May 23 to September 20). This visual timeline captures key developmental milestones including emergence, flowering initiation and completion, pod formation and development, seed development, and maturation phases. Each stage is represented by in-situ photographs, providing a clear visual reference for the physiological and morphological changes occurring throughout the growing season.
This chronological visualization serves as a valuable tool for correlating growth stages with the performance and accuracy of VWC and height estimation models, offering insights into stage-specific estimation challenges and opportunities in crop monitoring applications.

Figure 4.17: Temporal Dynamics of Soybean Growth Cycles: This figure illustrates the field measurements captured during the 2023 Michigan field campaign. The numbers on the images represent the day of the year (DOY) for each measurement.

Figure 4.18 compares estimated and actual soybean VWC across different field conditions in Michigan for 2022 and 2023, highlighting the performance of various estimation methods (RF, STL-FM, MTL-FM) throughout the growing season. The fluctuations in soybean VWC observed in certain samples, such as sample (d), highlight the difficulties in accurately estimating soybean VWC. These challenges arise from high within-field variability and complex environmental influences. Additionally, the limitations of the reference data, where three samples per pixel might not adequately capture the full range of VWC variability in the field, should be taken into account when interpreting these results. Moreover, it is important to note that VWC exhibits significant daily and sub-daily fluctuations, with potential depletion of 10-20% under non-stressed conditions and up to 35% under stress (Vermunt et al., 2022). These variations are influenced by transpiration, root water uptake, environmental factors, SM, plant stress, growth stage, and dew formation (Khabbazan et al., 2022; Vermunt et al., 2021). The VWC estimates presented in Figure 4.18 incorporated optimized feature combinations derived from Table 4.4, aimed at minimizing feature redundancy. Specifically, the STL-FM and RF models incorporate VH backscatter, NDVI, NDWI, red-edge indices, and climate features. The MTL-FM model employs the same feature set, with the exception of the red-edge index.
Figure 4.18 demonstrates that the STL-FM method consistently outperforms other approaches across all samples, accurately tracking actual VWC trends throughout the growing season. In contrast, the RF model tends to underestimate VWC, particularly during critical phenological stages such as flowering and pod development. This underestimation is most pronounced in panel (d), which highlights the complex and dynamic nature of VWC in soybeans. The VWC fluctuations, particularly evident in panel (d), highlight the challenges in obtaining representative field samples for soybeans. This variability reflects the significant spatial heterogeneity within a single pixel area, a common characteristic of soybean fields. These results emphasize the superiority of the STL-FM approach in capturing the nuanced VWC dynamics of soybeans across various growth stages and field conditions.

Figure 4.18: Comparison of estimated and actual soybean VWC across different field conditions and growing seasons. Panels (a) and (b) show non-irrigated soybean fields in Michigan in 2023; panels (c) and (d) present data from uphill and downhill pixels, respectively, located within a single soybean field in Michigan during the 2022 growing season. The plots overlay growth stages and compare actual VWC measurements with estimates from different methods (RF, STL-FM, and MTL-FM).

The test dataset for soybean exhibits relatively consistent crop residue levels across all cases. They correspond to samples a, c, d, and e in Figure 4.13, which illustrates the relationship of soybean VWC with SAR and optical features. However, there are notable differences in SM conditions. Panel (c) represents a downhill location with higher SM content, while panel (d) depicts an uphill area within the same field, characterized by lower SM and consequently lower VWC.
Despite not explicitly incorporating SR and SM data, the STL-FM method demonstrates reliable performance in estimating both VWC and height, as evidenced in Figures 4.18 and 4.19. SR typically exerts its greatest influence on estimation accuracy during the early growth stages. Notably, all three methods (STL-FM, MTL-FM, and RF) perform well in estimating VWC at the beginning of the growth cycle across most cases. The exception is case (d), where overestimation occurs. This discrepancy is likely attributable to the lower SM content in this uphill location rather than SR effects. Unfortunately, we lack SM data for this sample. Therefore, we recommend incorporating SM data in future studies to fully evaluate both uphill and downhill conditions in soybean or corn fields.

Figure 4.19: Comparison of estimated and actual soybean height for selected feature combinations from Table 4.4. The fields are the same as in Figure 4.18.

For soybean height estimation, the STL-FM method demonstrates excellent performance across nearly all growth stages. The RF method, however, tends to underestimate height for the Michigan 2023 samples. Interestingly, for samples c and d, the MTL-FM model exhibits more stable estimations compared to STL-FM, although it consistently overestimates the height.

Figure 4.20 illustrates the chronological progression of corn growth stages from planting to harvest, based on field observations in Michigan during the 2023 growing season (May 23 to September 20). This visual timeline captures key phenological phases of corn. Each stage is represented by in-situ photographs, providing a clear visual reference for the physiological and morphological changes occurring throughout the growing season.
This chronological visualization serves as a valuable tool for correlating growth stages with the performance and accuracy of the VWC and height estimation models, offering insights into stage-specific estimation challenges and opportunities in corn monitoring applications.

Figure 4.20: Temporal dynamics of corn growth cycles, based on field measurements captured during the 2023 Michigan field campaign. The numbers on the images represent the day of the year (DOY) for each measurement.

Figure 4.21 illustrates that RF with the VH, VV, red-edge, NDVI, and NDWI features performs adequately in estimating corn VWC up to 6-8 kg/m². However, it consistently underestimates VWC for higher values in the range of 8-10 kg/m². This limitation suggests that RF may struggle to capture the full range of VWC variability, particularly during peak biomass periods or in high-yielding conditions. Both the STL-FM and MTL-FM models demonstrate exceptional performance in predicting VWC. However, the overestimation noted during the R4 and R5 growth stages could be attributed to sub-daily fluctuations in VWC or inaccuracies in in-situ measurements, as corroborated by other samples depicted in Figures 4.21 and 4.14.

Figure 4.22 illustrates the corn height estimation results, in which STL-FM outperforms the other methods and generally performs well across most growth stages. However, notable underestimations occur during the V18 and R1 stages, both of which are related to ear shoot development. The VWC and height estimates for the Iowa test datasets are provided in the appendix (Figures C6 and C7).

Figure 4.21: Comparison of estimated and actual corn VWC for selected feature combinations from Table 4.4. The growth stages from Figure 4.20 are indicated on the plot. Panel (a) presents label data from a non-irrigated corn field in Michigan in 2023. Panels (c) and (d) display data from two different Michigan fields in 2022: field (c) with lower SM and field (d) with higher SM.
Figure 4.22: Comparison of estimated and actual corn height using selected feature combinations from Table 4.4, with fields identical to those shown in Figure 4.21.

4.4.5. Model Generalization

To test our model's generalization capability, we selected two irrigated fields (one soybean, one corn) from Michigan's 2022 measurements. These fields were excluded from the SSL phase of the FM and differ significantly from the training and test datasets in management practices. They are irrigated, unlike the other fields used for training our models, and the soybean field notably features high crop residue from previous corn cultivation with minimal tillage (Figure 4.5). These distinct characteristics provide a rigorous test of the model's adaptability to varied agricultural conditions and practices.

Figure 4.23 demonstrates model performance under conditions distinct from the training dataset, showcasing the generalizability of the different estimation methods (STL-FM, MTL-FM, and RF) across various growth stages. Due to irrigation, the soybean field exhibits higher VWC (6 kg/m²) than non-irrigated fields (3-4.5 kg/m²), leading to underestimation by all models. Despite the increased influence of SR during early growth stages, both the STL-FM and MTL-FM models demonstrate reliable performance. These models exhibit only slight overestimation, effectively mitigating the typically challenging impact of SR on VWC estimation in the early season. This resilience suggests that the FMs have successfully learned to account for SR effects, a significant advantage over traditional ML approaches.

In height estimation, the STL-FM model demonstrates superior performance across most growth stages. However, it exhibits a notable overestimation during the unrolled trifoliate leaf stage (V2). This overestimation likely stems from the combined effects of crop residue and high surface moisture on the ground, which influence the VH backscatter from both soil and emerging crop.
These factors can lead to an artificial increase in the perceived height of the young soybean plants, challenging accurate estimation during this early growth phase. The performance metrics for VWC and height estimation are presented in Table 4.5. These results demonstrate STL-FM's superior generalization capability, suggesting that future inclusion of irrigated datasets in training could further enhance its performance across diverse agricultural conditions and practices.

Table 4.5: Performance metrics of VWC and height estimates in the irrigated soybean field from Michigan's 2022 campaign.

Model    Metric   VWC Estimates   Height Estimates
STL-FM   MAE      0.91             7.73
STL-FM   RMSE     1.18            11.80
STL-FM   R²       0.79             0.89
MTL-FM   MAE      1.21            19.96
MTL-FM   RMSE     1.56            22.89
MTL-FM   R²       0.63             0.59
RF       MAE      1.32            13.30
RF       RMSE     1.61            17.14
RF       R²       0.61             0.77

Figure 4.23: High-residue, minimally tilled irrigated soybean field from Michigan's 2022 campaign, used to test model generalizability.

Figure 4.24: Irrigated corn field from Michigan's 2022 campaign, used to test model generalizability.

Figure 4.24 presents the generalizability results for an irrigated corn field from Michigan's 2022 campaign. The STL-FM and MTL-FM models significantly outperform RF in estimating VWC. RF consistently underestimates VWC and height, while the FM models tend to overestimate during the V18 and R1 growth stages, possibly due to irrigation effects on VH backscatter. The performance metrics are presented in Table 4.6. The results demonstrate that while RF underestimates crop height, it achieves the highest R² for height estimation, suggesting better correlation despite systematic underestimation.

Table 4.6: Performance metrics of VWC and height estimates in the irrigated corn field from Michigan's 2022 campaign.
Model    Metric   VWC Estimates   Height Estimates
STL-FM   MAE      0.83            33.83
STL-FM   RMSE     1.03            38.07
STL-FM   R²       0.85             0.68
MTL-FM   MAE      0.64            31.46
MTL-FM   RMSE     0.80            36.81
MTL-FM   R²       0.91             0.70
RF       MAE      1.71            26.26
RF       RMSE     2.06            29.10
RF       R²       0.41             0.81

The feature importance analysis reveals that, following precipitation, NDVI and NDWI have the most significant impact on corn and soybean parameter estimation. The observed height overestimation during the first four growth stages can be primarily attributed to the anomalously high NDVI values displayed in Figure 4.13g. Moreover, on DOY 196, the overestimation is likely due to increased VH backscatter, influenced by recorded rainfall. Additionally, irrigation practices during these early stages may affect VH backscatter. This effect is particularly pronounced in these growth stages, where the smaller canopy size allows C-band signals to penetrate more effectively, leading to a stronger influence of SM on the backscatter. These factors collectively contribute to the model's tendency to overestimate crop height and VWC in the early growth stages, highlighting the complex interplay between environmental conditions, management practices, and RS observations in agricultural monitoring. Furthermore, as mentioned in section 1.4.1, an anomalous behavior is observed in Figure 4.24i for the VH and VV signals on this pixel, which could contribute to this unusual overestimation.

Another case for testing the generalizability of our models is a selected corn field from Florida. This field presents a significantly different scenario from the Michigan and Iowa datasets used for training the models, featuring a distinct climate, irrigation, and a shorter growth cycle. Notably, the reference VWC data do not show the typical reduction at harvest time (DOY 162), possibly due to these unique growing conditions. The performance metrics for this case are presented in Table 4.7, and the VWC estimates from all three models are shown in Figure 4.25.
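For reference, the MAE, RMSE, and R² values reported in Tables 4.5 to 4.7 can be computed from paired in-situ and estimated values as in the minimal Python sketch below. This is an illustration under stated assumptions rather than the evaluation code of this study: it assumes R² is the coefficient of determination (it could alternatively be defined as the squared correlation coefficient), the function name `regression_metrics` is hypothetical, and the sample numbers are invented for demonstration only. The sketch also returns the mean bias, which makes systematic under- or overestimation explicit.

```python
import math


def regression_metrics(observed, estimated):
    """Return MAE, RMSE, R2 and mean bias for paired observed/estimated values.

    Assumption: R2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    if len(observed) != len(estimated) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    n = len(observed)
    # Residual = estimate minus observation, so a negative bias means underestimation.
    residuals = [e - o for o, e in zip(observed, estimated)]

    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    bias = sum(residuals) / n

    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "r2": r2, "bias": bias}


if __name__ == "__main__":
    # Hypothetical VWC values in kg/m2, not taken from the field campaigns.
    observed_vwc = [2.0, 4.0, 6.0, 8.0]
    estimated_vwc = [1.5, 3.5, 5.5, 7.5]
    print(regression_metrics(observed_vwc, estimated_vwc))
```

In this invented example every estimate is 0.5 kg/m² too low, so the bias is negative while R² stays high. A model can therefore track the seasonal trend well and still underestimate systematically, which is the pattern noted above for RF height estimates.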
Table 4.7: Performance metrics of VWC estimates in the irrigated corn field from the Florida 2018 campaign.

Model    Metric   VWC Estimates
STL-FM   MAE      0.84
STL-FM   RMSE     1.02
STL-FM   R²       0.62
MTL-FM   MAE      0.79
MTL-FM   RMSE     0.98
MTL-FM   R²       0.65
RF       MAE      0.83
RF       RMSE     1.25
RF       R²       0.44

Figure 4.25: Irrigated corn field from the Florida 2018 campaign, used to test model generalizability.

The results show that both FMs outperform the RF model, with MTL-FM slightly outperforming STL-FM. These results further demonstrate the ability of our FM approaches to generalize to significantly different agricultural contexts.

4.5. Discussion

4.5.1. Synergy of Optical and SAR data in VWC and crop height estimation

The combination of optical and SAR data, along with weather parameters, effectively addresses the limitations of optical data, particularly in high vegetation biomass where VWC exceeds 4 kg/m² (Cosh et al., 2019). Optical sensors, while valuable, are less sensitive to water content beneath the canopy surface, potentially leading to underestimation of plant water content. SAR data complements optical data by providing continuous monitoring even during cloudy periods, enhancing overall estimation accuracy.

Our analysis reveals distinct patterns in data source importance for corn and soybean VWC and crop height estimation. For corn, NDWI showed the highest importance, while NDVI ranked higher for soybeans. This difference reflects the physiological and structural characteristics of these crops. Corn's higher water content and distinct canopy structure allow better penetration of the shortwave infrared light used in NDWI calculations. Conversely, soybeans' more uniform canopy structure makes NDVI a better indicator of overall plant health and biomass, correlating well with water content. Notably, precipitation emerged as the highest-ranked factor for height estimation in both crops, surpassing even optical and SAR features and highlighting the critical role of water availability in plant growth.
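The two optical indices discussed above are both normalized band differences. The Python sketch below shows how they could be computed from surface reflectance values. It assumes the common definitions NDVI = (NIR - Red) / (NIR + Red) and NDWI = (NIR - SWIR) / (NIR + SWIR), the latter being consistent with the shortwave infrared formulation mentioned above. The function names and reflectance values are hypothetical, and the exact Sentinel-2 bands used in this study are not restated here.

```python
import math


def normalized_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b), or NaN if the sum is zero."""
    denominator = band_a + band_b
    if denominator == 0:
        return math.nan
    return (band_a - band_b) / denominator


def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance (assumed standard definition)."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """NDWI from near-infrared and shortwave-infrared reflectance (assumed NIR/SWIR form)."""
    return normalized_difference(nir, swir)


if __name__ == "__main__":
    # Hypothetical reflectances for a dense, well-watered canopy pixel.
    nir, red, swir = 0.5, 0.1, 0.3
    print("NDVI:", round(ndvi(nir, red), 3))
    print("NDWI:", round(ndwi(nir, swir), 3))
```

Because both indices share the same normalized form, they differ only in the band paired with NIR: the red band responds to chlorophyll absorption, whereas the SWIR band responds to liquid water absorption, which is why the two indices carry partly different information about the canopy.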
Additionally, we observed that VH polarization outperforms VV for soybeans, while both polarizations are important for corn parameter estimation. This difference can be attributed to the distinct canopy structures: soybeans' more horizontally oriented leaves interact more strongly with VH polarization (volume scattering), while corn's complex structure with vertical stalks and large leaves interacts significantly with both polarizations.

The fusion of SAR and optical data outperformed both optical and SAR alone in VWC and height estimation, with optical alone surpassing SAR alone. This hierarchy (fusion > optical > SAR) can be attributed to:

(i) Complementary information: Optical data provides detailed spectral information about the vegetation surface, while SAR offers insights into the structure and water content throughout the canopy.
(ii) Optical sensitivity to pigments: Optical indices like NDVI and NDWI are highly sensitive to chlorophyll and water content in leaves, which are strong indicators of overall plant health and water status.
(iii) SAR complexity: SAR data, while valuable, is more complex to interpret due to its sensitivity to multiple factors including SR, SM, and canopy structure.

While the fusion of SAR, optical, and weather features is effective for both VWC and height estimation, maximizing estimation accuracy also requires reducing feature redundancy and preventing overfitting. Zhang et al. (2020) highlighted this challenge, noting that relatively few studies have focused on optimal SAR feature selection. In our study, we addressed this by analyzing 22 feature combinations to identify the optimal set (Figure 4.16). The most effective combination typically included one or two SAR features (either VH alone or both VH and VV), red-edge, NDVI, NDWI, and climate features. This careful feature selection process balanced model complexity with performance, enhancing our approach's overall efficacy.

4.5.2. Foundation Models: Capabilities and Generalizability

Our study marks a significant advancement in applying geospatial FMs to VWC and crop height estimation. By leveraging SSL on large, unlabeled datasets, our approach addresses limitations of traditional methods like linear regressions and the WCM. The STL-FM consistently outperformed traditional RF and XGB, particularly in capturing nuanced VWC dynamics across various growth stages. This is evidenced by lower MAE and higher R² values, especially during critical phenological stages like flowering and pod/ear development.

Incorporating temporal modeling into the supervised MTL and STL frameworks, by adding a 1D convolutional layer after fine-tuning the pre-trained geospatial FM encoder, enhanced the performance of both the MTL and STL models. Unlike traditional ML models, which do not account for the temporal relationships within the data during training, 1D convolution captures temporal patterns within the training time series. Given that part of our reference dataset (Iowa 2016) includes only 1-2 time steps, the performance of the STL-FM model is close to that of Random Forest, with STL-FM slightly outperforming RF. With a more extensive time series dataset, this performance gap could potentially widen even further. Additionally, the model is capable of making predictions from single time-step data as well as from time series covering the entire growth cycle. Consequently, the model performed well even with the limited time steps available in the Iowa dataset (Figures C6 and C7).

The diverse training dataset, encompassing varied climatic zones (Iowa and Michigan) and management practices (including different tillage methods), significantly enhanced the model's generalizability.
This was demonstrated by its successful application to Florida's distinct climate and irrigated fields, which were not included in the training data or the SSL process, thus showcasing the model's reliable performance across diverse agricultural contexts. While RF showed comparable performance to the FM in training and testing, the FM demonstrated superior generalizability across diverse scenarios. The STL-FM showed remarkable adaptability, maintaining high accuracy even in challenging conditions such as fields with high crop residue or varying SM.

Comparison of the STL-FM with existing research on VWC estimation shows that our model outperformed prior work not only in statistical metrics but also in the variety of climate and field conditions covered. For example, Xu et al. (2020) employed a linear regression model using SMAPVEX16 data from a single time step in August in Iowa, combined with MODIS-derived NDWI. Their approach yielded RMSE values of 1.31 kg/m² for corn and 0.94 kg/m² for soybean field canopy VWC, with corresponding R² values of 0.66 and 0.85. In contrast, our STL-FM achieved RMSE values of 0.7 and 0.3 kg/m² for corn and soybean VWC, with R² values of 0.89 and 0.9, using VH, red-edge, NDVI, NDWI, and weather parameters. Our model's generalization results for corn (RMSE of 1.03, R² of 0.85) also surpassed those of Xu et al. (2020), whereas the results for the irrigated soybean field (RMSE of 1.18, R² of 0.79) fell slightly short of their soybean values. Overall, these results demonstrate the capability of SSL with FMs and the benefit of feature fusion. The results for height estimation were also impressive, with a corn RMSE of 15 cm and R² of 0.98, and a soybean RMSE of 7.4 cm and R² of 0.95.

However, we observed some limitations, particularly in early growth stages and in estimating extreme soybean VWC values (>4.5 kg/m²) and corn VWC values (>8 kg/m²). These challenges highlight areas for future research, possibly through the incorporation of irrigated fields.

Additionally, the MTL approach did not consistently outperform STL in our experiments.
MTL is typically beneficial when leveraging information from related tasks can compensate for limited data in one of the tasks. However, in our case, the geospatial FM effectively mitigates the data scarcity issue by utilizing SSL to enhance feature representation learning. Consequently, the inherent advantage of MTL in leveraging task interdependencies may not be as pronounced. This outcome is expected: MTL seeks to optimize performance across multiple tasks, while STL is fine-tuned for a specific task. STL may therefore perform better in scenarios requiring specialized solutions, as MTL might compromise on individual task efficacy to achieve average performance across tasks.

4.5.3. Surface roughness and soil moisture consideration

Previous studies, including those by McNairn et al. (2001) and Smith and Major (1996), have shown that crop residue significantly affects backscatter, particularly when it is wet. This is relevant to situations such as the irrigated soybean field in Michigan 2022 (Figure 4.5) or cover crops in the non-irrigated corn field in Michigan 2023 (Figure 7C1). Cross-polarizations are especially sensitive to residue amounts. Therefore, the VH signal reflected from the soybean field used for generalization was affected by crop residue.

While SR and SM are known to influence SAR backscatter, especially at lower frequencies such as L-band, incorporating these measurements for large-scale VWC estimation presents significant challenges. Our approach prioritizes freely available and easily accessible data sources such as Sentinel-1A, Sentinel-2, and weather information. Despite not explicitly including SR and SM inputs, our models achieved reliable VWC and height estimates. This suggests that while these factors contribute to backscatter, their impact may be secondary to that of SAR, optical, and precipitation data in VWC and height estimation using FMs (as illustrated by the feature importance in Figure 4.15).
The STL-FM demonstrated a remarkable ability to account for SR effects, outperforming traditional ML approaches in estimating VWC and height for irrigated soybean in Michigan 2022 (Figure 4.23). Our key point is that providing SM and SR as model inputs for estimating VWC and height would be challenging in future applications. Instead, by providing reference VWC data from more diverse fields with different management practices (irrigation and tillage methods), we can train the model without directly incorporating SR and SM, making it more widely applicable.

However, it is worth noting that while SR and SM may have limited impact on the C-band data used in this study, they could become more significant when using L-band SAR, potentially introducing additional complexity. Future research might explore the integration of these factors, particularly for L-band applications, while balancing the trade-off between model complexity and data availability.

4.5.4. In-situ measurement refinement

The accuracy of our model training and evaluation is inherently tied to the quality of the reference VWC and height data. Several factors contribute to potential uncertainties in these measurements:

1. Diurnal VWC Fluctuations: VWC can vary by 10-20% daily under normal conditions, increasing to 35% under stress (Vermunt et al., 2022). The time gap between field measurements (typically morning to afternoon) and the Sentinel-1A overpass (night) may introduce discrepancies due to transpiration-induced VWC reduction throughout the day.
2. Dew Effects: While efforts were made to dry samples before measurement, dew presence during early morning collections, particularly in irrigated fields, could affect VWC readings. The nighttime Sentinel-1A overpass mitigates this issue, but future studies using different SAR data should consider dew impact based on sensor revisit times.
3. Spatial Heterogeneity: Some crops, like soybeans, exhibit high variability within a 50-meter grid pixel.
To better represent this heterogeneity, future studies should consider increasing the number of samples per pixel, potentially exceeding the current 10-sample approach.

Addressing these factors in future data collection protocols could significantly enhance the robustness of reference data, thereby improving model training and evaluation accuracy.

4.6. Conclusion

This study represents a significant leap forward in the application of FMs for estimating VWC and crop height. By harnessing the power of SSL on large, unlabeled datasets, our approach effectively addresses the limitations of traditional methods, offering superior performance across diverse agricultural settings.

Our STL-FM consistently outperformed MTL-FM and conventional ML techniques, demonstrating remarkable adaptability across varied climatic zones and management practices. The fusion of Sentinel-1A C-band SAR, Sentinel-2 optical data, and weather information proved particularly effective, overcoming challenges such as saturation in high biomass regions and reduced sensitivity to sub-canopy moisture.

The study revealed important insights into the synergistic effects of combining different data sources. The VH polarization, alongside optical indices like red-edge, NDVI and NDWI, emerged as crucial predictors for VWC, with precipitation playing a surprisingly significant role in height estimation. This multi-sensor approach effectively addresses the limitations of individual data sources, providing a more comprehensive view of crop conditions.

Our models demonstrated reliable performance even in challenging scenarios, such as fields with high crop residue or varying SM. This resilience suggests that geospatial FMs have implicitly learned to account for complex environmental interactions, potentially reducing the need for explicit inclusion of factors like surface roughness and soil moisture in large-scale applications.
While the study achieved impressive results, particularly in estimating VWC and height for corn and soybeans, it also highlighted areas for future research. These include refining the model's performance for extreme VWC values and in early growth stages. Looking ahead, the integration of Sentinel-1A C-band and the upcoming NISAR L-band SAR data presents exciting opportunities for enhancing crop monitoring capabilities. Future research should also focus on expanding the application of geospatial FMs to larger areas with diverse crop types and climates.

The advancements presented in this study have significant implications for precision agriculture. By providing more accurate and timely information on crop water content and height, these models can support more efficient resource use and potentially higher yields. As we continue to refine and expand these techniques, we move closer to developing universal crop monitoring systems capable of adapting to diverse agricultural contexts and supporting sustainable farming practices worldwide.

REFERENCES

Alaska Satellite Facility (ASF), 2023. Copernicus Sentinel data. URL https://search.asf.alaska.edu/#/ (accessed 4.6.24).
Attema, E.P.W., Ulaby, F.T., 1978. Vegetation modeled as a water cloud. Radio Sci 13, 357–364. https://doi.org/10.1029/RS013i002p00357
Ayush, K., Uzkent, B., Meng, C., Tanmay, K., Burke, M., Lobell, D., Ermon, S., 2021. Geography-aware self-supervised learning, in: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 10181–10190. https://doi.org/10.48550/arXiv.2011.09980
Bastani, F., Wolters, P., Gupta, R., Ferdinando, J., Kembhavi, A., 2023. Satlaspretrain: A large-scale dataset for remote sensing image understanding, in: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 16772–16782. https://doi.org/10.48550/arXiv.2211.15660
Belgiu, M., Drăguţ, L., 2016.
Random forest in remote sensing: A review of applications and future directions. ISPRS journal of photogrammetry and remote sensing 114, 24–31. https://doi.org/10.1016/j.isprsjprs.2016.01.011
Bountos, N.I., Ouaknine, A., Rolnick, D., 2023. FoMo-Bench: a multi-modal, multi-scale and multi-task Forest Monitoring Benchmark for remote sensing foundation models. arXiv preprint arXiv:2312.10114. https://doi.org/10.48550/arXiv.2312.10114
Breiman, L., 2001. Random forests. Mach Learn 45, 5–32. https://link.springer.com/article/10.1023/A:1010933404324
Ceccato, P., Flasse, S., Tarantola, S., Jacquemoud, S., Grégoire, J.-M., 2001. Detecting vegetation leaf water content using reflectance in the optical domain. Remote Sens Environ 77, 22–33. https://doi.org/10.1016/S0034-4257(01)00191-2
Cha, K., Seo, J., Lee, T., 2023. A billion-scale foundation model for remote sensing images. arXiv preprint arXiv:2304.05215. https://doi.org/10.48550/arXiv.2304.05215
Chen, D., Huang, J., Jackson, T.J., 2005. Vegetation water content estimation for corn and soybeans using spectral indices derived from MODIS near- and short-wave infrared bands. Remote Sens Environ 98, 225–236. https://doi.org/10.1016/j.rse.2005.07.008
Chen, T., Guestrin, C., 2016. Xgboost: A scalable tree boosting system, in: Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining. pp. 785–794. https://doi.org/10.1145/2939672.2939785
Chen, T., Kornblith, S., Norouzi, M., Hinton, G., 2020. A simple framework for contrastive learning of visual representations, in: International Conference on Machine Learning. PMLR, pp. 1597–1607. http://proceedings.mlr.press/v119/chen20j.html
Colliander, A., Jackson, T.J., Bindlish, R., Chan, S., Das, N., Kim, S.B., Cosh, M.H., Dunbar, R.S., Dang, L., Pashaian, L., 2017. Validation of SMAP surface soil moisture products with core validation sites. Remote Sens Environ 191, 215–231.
https://doi.org/10.1016/j.rse.2017.01.021
Cong, Y., Khanna, S., Meng, C., Liu, P., Rozi, E., He, Y., Burke, M., Lobell, D., Ermon, S., 2022. Satmae: Pre-training transformers for temporal and multi-spectral satellite imagery. Adv Neural Inf Process Syst 35, 197–211. https://proceedings.neurips.cc/paper_files/paper/2022/file/01c561df365429f33fcd7a7faa44c985-Paper-Conference.pdf
Coopersmith, E.J., Cosh, M.H., Petersen, W.A., Prueger, J., Niemeier, J.J., 2015. Soil moisture model calibration and validation: An ARS watershed on the South Fork Iowa River. J Hydrometeorol 16, 1087–1101. https://doi.org/10.1175/JHM-D-14-0145.1
Cosh, M.H., White, W.A., Colliander, A., Jackson, T.J., Prueger, J.H., Hornbuckle, B.K., Hunt, E.R., McNairn, H., Powers, J., Walker, V.A., Bullock, P., 2019. Estimating vegetation water content during the Soil Moisture Active Passive Validation Experiment 2016. J Appl Remote Sens 13, 1. https://doi.org/10.1117/1.JRS.13.014516
Dosovitskiy, A., 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929. https://doi.org/10.48550/arXiv.2010.11929
Fensholt, R., Sandholt, I., 2003. Derivation of a shortwave infrared water stress index from MODIS near- and shortwave infrared data in a semiarid environment. Remote Sens Environ 87, 111–121. https://doi.org/10.1016/j.rse.2003.07.002
Ferrazzoli, P., Paloscia, S., Pampaloni, P., Schiavon, G., Sigismondi, S., Solimini, D., 1997. The potential of multifrequency polarimetric SAR in assessing agricultural and arboreous biomass. IEEE Transactions on Geoscience and Remote Sensing 35, 5–17. https://doi.org/10.1109/36.551929
Friedman, J.H., 2001. Greedy function approximation: a gradient boosting machine. Ann Stat 1189–1232. https://www.jstor.org/stable/2699986
Friesen, J., Steele-Dunne, S.C., Giesen, N. van de, 2012. Diurnal Differences in Global ERS Scatterometer Backscatter Observations of the Land Surface.
IEEE Transactions on Geoscience and Remote Sensing 50, 2595–2602. https://doi.org/10.1109/TGRS.2012.2193889
Fuller, A., Millard, K., Green, J.R., 2023. CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders. arXiv preprint arXiv:2311.00566. https://proceedings.neurips.cc/paper_files/paper/2023/file/11822e84689e631615199db3b75cd0e4-Paper-Conference.pdf
Gao, Y., Walker, J.P., Allahmoradi, M., Monerris, A., Ryu, D., Jackson, T.J., 2015. Optical Sensing of Vegetation Water Content: A Synthesis Study. IEEE J Sel Top Appl Earth Obs Remote Sens 8, 1456–1464. https://doi.org/10.1109/JSTARS.2015.2398034
Garnot, V.S.F., Landrieu, L., Chehata, N., 2022. Multi-modal temporal attention models for crop mapping from satellite time series. ISPRS Journal of Photogrammetry and Remote Sensing 187, 294–305. https://doi.org/10.1016/j.isprsjprs.2022.03.012
He, K., Fan, H., Wu, Y., Xie, S., Girshick, R., 2020. Momentum contrast for unsupervised visual representation learning, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 9729–9738. https://doi.org/10.48550/arXiv.1911.05722
Hosseini, M., McNairn, H., Merzouki, A., Pacheco, A., 2015. Estimation of Leaf Area Index (LAI) in corn and soybeans using multi-polarization C-and L-band radar data. Remote Sens Environ 170, 77–89. https://doi.org/10.1016/j.rse.2015.09.002
Huang, J., Chen, D., Cosh, M.H., 2009. Sub‐pixel reflectance unmixing in estimating vegetation water content and dry biomass of corn and soybeans cropland using normalized difference water index (NDWI) from satellites. Int J Remote Sens 30, 2075–2104. https://doi.org/10.1080/01431160802549245
Huang, Y., Walker, J.P., Gao, Y., Wu, X., Monerris, A., 2016. Estimation of Vegetation Water Content From the Radar Vegetation Index at L-Band. IEEE Transactions on Geoscience and Remote Sensing 54, 981–989. https://doi.org/10.1109/TGRS.2015.2471803
Hunt, E.R., Li, L., Yilmaz, M.T., Jackson, T.J., 2011.
Comparison of vegetation water contents derived from shortwave-infrared and passive-microwave sensors over central Iowa. Remote Sens Environ 115, 2376–2383. https://doi.org/10.1016/j.rse.2011.04.037
Hunt Jr, E.R., Li, L., Friedman, J.M., Gaiser, P.W., Twarog, E., Cosh, M.H., 2018. Incorporation of stem water content into vegetation optical depth for crops and woodlands. Remote Sens 10, 273. https://doi.org/10.3390/rs10020273
Ienco, D., Gaetano, R., Dupaquier, C., Maurel, P., 2017. Land cover classification via multitemporal spatial data by deep recurrent neural networks. IEEE Geoscience and Remote Sensing Letters 14, 1685–1689. https://doi.org/10.1109/LGRS.2017.2728698
Jackson, T.J., Chen, D., Cosh, M., Li, F., Anderson, M., Walthall, C., Doriaswamy, P., Hunt, E.R., 2004. Vegetation water content mapping using Landsat data derived normalized difference water index for corn and soybeans. Remote Sens Environ 92, 475–482. https://doi.org/10.1016/j.rse.2003.10.021
Jackson, T.J., Schmugge, T.J., Wang, J.R., 1982. Passive microwave sensing of soil moisture under vegetation canopies. Water Resour Res 18, 1137–1142. https://doi.org/10.1029/WR018i004p01137
Jacquemoud, S., Verhoef, W., Baret, F., Bacour, C., Zarco-Tejada, P.J., Asner, G.P., François, C., Ustin, S.L., 2009. PROSPECT+SAIL models: A review of use for vegetation characterization. Remote Sens Environ 113, S56–S66. https://doi.org/10.1016/j.rse.2008.01.026
Jain, P., Schoen-Phelan, B., Ross, R., 2022. Self-supervised learning for invariant representations from multi-spectral and SAR images. IEEE J Sel Top Appl Earth Obs Remote Sens 15, 7797–7808. https://doi.org/10.1109/JSTARS.2022.3204888
Jakubik, J., Roy, S., Phillips, C.E., Fraccaro, P., Godwin, D., Zadrozny, B., Szwarcman, D., Gomes, C., Nyirjesy, G., Edwards, B., 2023. Foundation Models for Generalist Geospatial Artificial Intelligence. arXiv preprint arXiv:2310.18660.
https://doi.org/10.1038/s41586-023-05881-4
Judge, J., Liu, P.-W., Monsiváis-Huertero, A., Bongiovanni, T., Chakrabarti, S., Steele-Dunne, S.C., Preston, D., Allen, S., Bermejo, J.P., Rush, P., DeRoo, R., Colliander, A., Cosh, M., 2021. Impact of vegetation water content information on soil moisture retrievals in agricultural regions: An analysis based on the SMAPVEX16-MicroWEX dataset. Remote Sens Environ 265, 112623. https://doi.org/10.1016/j.rse.2021.112623
Khabbazan, S., Steele-Dunne, S.C., Vermunt, P., Judge, J., Vreugdenhil, M., Gao, G., 2022. The influence of surface canopy water on the relationship between L-band backscatter and biophysical variables in agricultural monitoring. Remote Sens Environ 268, 112789. https://doi.org/10.1016/j.rse.2021.112789
Khabbazan, S., Vermunt, P., Steele-Dunne, S., Ratering Arntz, L., Marinetti, C., van der Valk, D., Iannini, L., Molijn, R., Westerdijk, K., van der Sande, C., 2019. Crop monitoring using Sentinel-1 data: A case study from The Netherlands. Remote Sens 11, 1887. https://doi.org/10.3390/rs11161887
Kim, S.-B., Huang, H., Liao, T.-H., Colliander, A., 2018. Estimating Vegetation Water Content and Soil Surface Roughness Using Physical Models of L-Band Radar Scattering for Soil Moisture Retrieval. Remote Sens 10, 556. https://doi.org/10.3390/rs10040556
Kim, Y., Jackson, T., Bindlish, R., Lee, H., Hong, S., 2011. Radar vegetation index for estimating the vegetation water content of rice and soybean. IEEE Geoscience and Remote Sensing Letters 9, 564–568. https://doi.org/10.1109/LGRS.2011.2174772
Konings, A.G., Rao, K., Steele‐Dunne, S.C., 2019. Macro to micro: microwave remote sensing of plant water content for physiology and ecology. New Phytologist 223, 1166–1172. https://doi.org/10.1111/nph.15808
Kumar, K., Hari Prasad, K.S., Arora, M.K., 2012. Estimation of water cloud model vegetation parameters using a genetic algorithm. Hydrological Sciences Journal 57, 776–789.
https://doi.org/10.1080/02626667.2012.678583 Lawrence, H., Wigneron, J.-P., Demontoux, F., Mialon, A., Kerr, Y.H., 2013. Evaluating the semiempirical H-Q model used to calculate the L-band emissivity of a rough bare soil. IEEE Transactions 4075–4084. and https://doi.org/10.1109/TGRS.2012.2226995 Geoscience Sensing Remote 51, on Lee, J.-S., Ainsworth, T.L., 2010. The effect of orientation angle compensation on coherency matrix and polarimetric target decompositions. IEEE Transactions on Geoscience and Remote Sensing 49, 53–64. https://doi.org/10.1109/TGRS.2010.2048333 Lee, J.-S., Pottier, E., 2017. Polarimetric radar imaging: from basics to applications. CRC press. https://doi.org/10.1201/9781420054989 253 Li, W., Chen, K., Chen, H., Shi, Z., 2021. Geographical knowledge-driven representation learning for remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 60, 1–16. https://doi.org/10.1109/TGRS.2021.3115569 Liaw, A., Wiener, M., 2002. Classification and regression by randomForest. R news 2, 18–22. https://journal.r-project.org/articles/RN-2002-022/RN-2002-022.pdf Liu, P.-W., de Roo, R.D., England, A.W., Judge, J., 2013. Impact of Moisture Distribution Within the Sensing Depth on L- and C-Band Emission in Sandy Soils. IEEE J Sel Top Appl Earth Obs Remote Sens 6, 887–899. https://doi.org/10.1109/JSTARS.2012.2213239 Lundberg, S.M., Lee, S.-I., 2017. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst 30. https://doi.org/10.48550/arXiv.1705.07874 Mall, U., Hariharan, B., Bala, K., 2023. Change-Aware Sampling and Contrastive Learning for Satellite Images, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5261–5270. http://dx.doi.org/10.1109/CVPR52729.2023.00509 Manas, O., Lacoste, A., Giró-i-Nieto, X., Vazquez, D., Rodriguez, P., 2021. Seasonal contrast: Unsupervised pre-training from uncurated remote sensing data, in: Proceedings of the IEEE/CVF International 9414–9423. 
Conference http://dx.doi.org/10.1109/ICCV48922.2021.00928 Computer Vision. pp. on McNairn, H., Duguay, C., Boisvert, J., Huffman, E., Brisco, B., 2001. Defining the sensitivity of multi-frequency and multi-polarized radar backscatter to post-harvest crop residue. Canadian Journal of Remote Sensing 27, 247–263. https://doi.org/10.1080/07038992.2001.10854941 Michael H. Cosh, 2010. Vegetation water content mapping in a diverse agricultural landscape: National Airborne Field Experiment 2006. J Appl Remote Sens 4, 043532. https://doi.org/10.1117/1.3449090 Muhtar, D., Zhang, X., Xiao, P., Li, Z., Gu, F., 2023. CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding. IEEE Transactions on Geoscience and Remote Sensing. https://doi.org/10.1109/TGRS.2023.3268232 Nasirzadehdizaji, R., Balik Sanli, F., Abdikan, S., Cakir, Z., Sekertekin, A., Ustuner, M., 2019. Sensitivity analysis of multi-temporal Sentinel-1 SAR parameters to crop height and canopy coverage. Applied Sciences 9, 655. https://doi.org/10.3390/app9040655 Oh, Y., Kay, Y.C., 1998. Condition for precise measurement of soil surface roughness. IEEE transactions on geoscience and remote sensing 36, 691–695. https://doi.org/10.1109/36.662751 Oki, T., Kanae, S., 2006. Global hydrological cycles and world water resources. Science (1979) 313, 1068–1072. https://doi.org/10.1126/science.1128845 Pierdicca, N., Pulvirenti, L., Bignami, C., 2010. Soil moisture estimation over vegetated terrains using multitemporal sensing data. Remote Sens Environ 114, 440–448. remote https://doi.org/https://doi.org/10.1016/j.rse.2009.10.001 254 Prexl, J., Schmitt, M., 2023. Multi-Modal Multi-Objective Contrastive Learning for Sentinel-1/2 Imagery, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 2135–2143. https://doi.org/10.1109/CVPRW59228.2023.00207 Ruder, S., 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098. 
https://doi.org/10.48550/arXiv.1706.05098 Saadat, M., Seydi, S.T., Hasanlou, M., Homayouni, S., 2022. A Convolutional Neural Network Method for Rice Mapping Using Time-Series of Sentinel-1 and Sentinel-2 Imagery. Agriculture 12, 2083. https://doi.org/10.3390/agriculture12122083 Scheibenreif, L., Hanna, J., Mommert, M., Borth, D., 2022. Self-supervised vision transformers for land-cover segmentation and classification, in: Proceedings of the IEEE/CVF Conference on Computer 1422–1431. https://doi.org/10.1109/CVPRW56347.2022.00148 Recognition. Pattern Vision and pp. Smith, A.M., Major, D.J., 1996. Radar backscatter and crop residues. Canadian journal of remote sensing 22, 243–247. https://doi.org/10.1080/07038992.1996.10855179 Soudani, K., le Maire, G., Dufrêne, E., François, C., Delpierre, N., Ulrich, E., Cecchini, S., 2008. Evaluation of the onset of green-up in temperate deciduous broadleaf forests derived from Moderate Resolution Imaging Spectroradiometer (MODIS) data. Remote Sens Environ 112, 2643–2655. https://doi.org/10.1016/j.rse.2007.12.004 Steele-Dunne, S.C., McNairn, H., Monsivais-Huertero, A., Judge, J., Liu, P.-W., Papathanassiou, K., 2017. Radar remote sensing of agricultural canopies: A review. IEEE J Sel Top Appl Earth Obs Remote Sens 10, 2249–2273. https://doi.org/10.1109/JSTARS.2016.2639043 Sun, X., Wang, P., Lu, W., Zhu, Z., Lu, X., He, Q., Li, J., Rong, X., Yang, Z., Chang, H., 2022. RingMo: A remote sensing foundation model with masked image modeling. IEEE Transactions on Geoscience and Remote Sensing. https://doi.org/10.1109/TGRS.2022.3194732 Togliatti, K., Lewis-Beck, C., Walker, V.A., Hartman, T., VanLoocke, A., Cosh, M.H., Hornbuckle, B.K., 2022. Quantitative Assessment of Satellite L-Band Vegetation Optical Depth IEEE Geoscience and Remote Sensing Letters 19, 1–5. in https://doi.org/10.1109/LGRS.2020.3034174 the U.S. Corn Belt. Tsaris, A., Dias, P.A., Potnis, A., Yin, J., Wang, F., Lunga, D., 2024. 
Pretraining Billion-scale Geospatial Foundational Models arXiv:2404.11706. on Frontier. https://doi.org/10.48550/arXiv.2404.11706 preprint arXiv Vermunt, P.C., Khabbazan, S., Steele-Dunne, S.C., Judge, J., Monsivais-Huertero, A., Guerriero, L., Liu, P.-W., 2021. Response of Subdaily L-Band Backscatter to Internal and Surface Canopy Water Dynamics. IEEE Transactions on Geoscience and Remote Sensing 59, 7322–7337. https://doi.org/10.1109/TGRS.2020.3035881 Vermunt, P.C., Steele‐Dunne, S.C., Khabbazan, S., Judge, J., van de Giesen, N.C., 2022. Extrapolating continuous vegetation water content to understand sub-daily backscatter variations. 255 Hydrol Earth Syst Sci. https://doi.org/10.5194/hess-26-1223-2022, 2022 Vreugdenhil, M., Wagner, W., Bauer-Marschallinger, B., Pfeil, I., Teubner, I., Rüdiger, C., Strauss, P., 2018. Sensitivity of Sentinel-1 Backscatter to Vegetation Dynamics: An Austrian Case Study. Remote Sens 10, 1396. https://doi.org/10.3390/rs10091396 Walker, V.A., Wallace, V., Yildirim, E., Eichinger, W.E., Cosh, M.H., Hornbuckle, B.K., 2023. From field observations to temporally dynamic soil surface roughness retrievals in the US Corn Belt. Remote Sens Environ 287, 113458. https://doi.org/10.1016/j.rse.2023.113458 Wang, D., Zhang, Q., Xu, Y., Zhang, J., Du, B., Tao, D., Zhang, L., 2022. Advancing plain vision transformer toward remote sensing foundation model. IEEE Transactions on Geoscience and Remote Sensing 61, 1–15. https://doi.org/10.1109/TGRS.2022.3222818 Wang, L., Hunt Jr, E.R., Qu, J.J., Hao, X., Daughtry, C.S.T., 2013. Remote sensing of fuel moisture content from ratios of narrow-band vegetation water and dry-matter indices. Remote Sens Environ 129, 103–110. https://doi.org/10.1016/j.rse.2012.10.027 Wang, Y., Albrecht, C.M., Zhu, X.X., 2022. Self-supervised vision transformers for joint SAR- optical representation learning, in: IGARSS 2022-2022 IEEE International Geoscience and Remote 139–142. 
https://doi.org/10.1109/IGARSS46834.2022.9883983 Symposium. Sensing IEEE, pp. Wang, Y., Zhang, T., Zhao, L., Hu, L., Wang, Zhechao, Niu, Z., Cheng, P., Chen, K., Zeng, X., Wang, Zhirui, 2023. RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN- Transformer arXiv:2309.09003. Hybrid https://doi.org/10.48550/arXiv.2309.09003 Framework. preprint arXiv Xu, C., Qu, J.J., Hao, X., Cosh, M.H., Zhu, Z., Gutenberg, L., 2020. Monitoring crop water content for corn and soybean fields through data fusion of MODIS and Landsat measurements in Iowa. Agric Water Manag 227, 105844. https://doi.org/https://doi.org/10.1016/j.agwat.2019.105844 Xu, F., Jin, Y.-Q., 2005. Deorientation theory of polarimetric scattering targets and application to terrain surface classification. IEEE Transactions on Geoscience and Remote Sensing 43, 2351– 2364. https://doi.org/10.1109/TGRS.2005.855064 Xu, Y., Ma, Y., Zhang, Z., 2024. Self-supervised pre-training for large-scale crop mapping using Sentinel-2 time series. ISPRS Journal of Photogrammetry and Remote Sensing 207, 312–325. https://doi.org/10.1016/j.isprsjprs.2023.12.005 Ye, N., Walker, J.P., Wu, X., de Jeu, R., Gao, Y., Jackson, T.J., Jonard, F., Kim, E., Merlin, O., Pauwels, V.R.N., Renzullo, L.J., Rudiger, C., Sabaghy, S., von Hebel, C., Yueh, S.H., Zhu, L., 2021. The Soil Moisture Active Passive Experiments: Validation of the SMAP Products in Australia. IEEE Transactions on Geoscience and Remote Sensing 59, 2922–2939. https://doi.org/10.1109/TGRS.2020.3007371 Yihyun Kim, Jackson, T., Bindlish, R., Hoonyol Lee, Sukyoung Hong, 2012. Radar Vegetation Index for Estimating the Vegetation Water Content of Rice and Soybean. IEEE Geoscience and 256 Remote Sensing Letters 9, 564–568. https://doi.org/10.1109/LGRS.2011.2174772 Yihyun Kim, Jackson, T., Bindlish, R., Sukyoung Hong, Gunho Jung, Kyuongdo Lee, 2014. Retrieval of Wheat Growth Parameters With Radar Vegetation Indices. IEEE Geoscience and Remote Sensing Letters 11, 808–812. 
https://doi.org/10.1109/LGRS.2013.2279255 Yuan, Y., Lin, L., Zhou, Z.-G., Jiang, H., Liu, Q., 2023. Bridging optical and SAR satellite image time series via contrastive feature extraction for crop classification. ISPRS Journal of Photogrammetry 222–232. https://doi.org/https://doi.org/10.1016/j.isprsjprs.2022.11.020 Sensing Remote 195, and Zhang, Q., Li, L., Sun, R., Zhu, D., Zhang, C., Chen, Q., 2020. Retrieval of the soil salinity from Sentinel-1 Dual-Polarized SAR data based on deep neural network regression. IEEE Geoscience and Remote Sensing Letters 19, 1–5. http://dx.doi.org/10.1109/LGRS.2020.3041059 Zhang, T., Gao, P., Dong, H., Zhuang, Y., Wang, G., Zhang, W., Chen, H., 2022. Consecutive pre- training: A knowledge transfer learning strategy with relevant unlabeled data for remote sensing domain. Remote Sens 14, 5675. https://doi.org/10.3390/rs14225675 Zhou, Y., Luo, J., Feng, L., Zhou, X., 2019. DCN-based spatial features for improving parcel- based crop classification using high-resolution optical images and multi-temporal SAR data. Remote Sens 11, 1619. http://dx.doi.org/10.3390/rs11131619 257 APPENDIX Table C1: Computed geometric roughness parameters using Pinboard observations from Michigan 2023 field campaign. 
Pixel Id DOY 𝜎𝑠, mm 𝑙𝑐, mm standard error in-row across-row in-row across-row in-row across-row
Corn1-1 Corn1-2 Corn2 Corn3-1 Corn3-2 soy1-1 soy1-2 soy2-1 soy2-2 soy3-1 soy3-2
142 166 201 142 166 201 142 166 201 142 166 201 142 166 201 142 166 201 142 166 201 142 166 201 142 166 201 142 166 201 142 166 201
- 4.10 4.35 5.01 6.19 - 3.67 4.55 - 2.56 4.37 - 3.69 2.81 - 9.67 4.72 - 3.04 3.56 - 4.91 7.56 5.91 5.79 4.41 3.95 3.81 3.84 - - 4.32 -
- 7.36 6.09 9.35 8.29 - 5.94 7.68 - 7.08 7.39 - 8.96 11.00 - 5.71 7.55 - 8.26 9.64 - 8.53 5.83 6.77 8.65 7.97 8.30 9.44 11.18 - - 8.54 -
- 43.33 23.33 36.66 23.33 - 30.00 46.67 - 36.67 10.00 - 43.33 56.67 - 10.00 30.00 - 46.67 53.33 - 10.00 23.33 43.33 13.33 20.00 36.67 40.00 60.00 - - 16.00 -
- 0.26 0.10 0.43 0.64 - 0.16 0.42 - 0.24 0.52 - 0.45 0.23 - 1.43 0.16 - 0.10 0.21 - 0.42 0.53 0.67 1.15 0.07 0.20 0.47 0.37 - - 0.45 -
- 0.43 0.53 1.71 0.91 - 0.33 0.91 - 0.76 1.08 - 0.07 2.16 - 0.58 0.39 - 0.84 1.05 - 0.13 0.37 1.08 0.45 0.72 1.11 0.76 1.60 - - 0.92 -
- 33.33 33.33 10.00 50.00 - 10.00 33.33 - 16.67 33.33 - 20.00 36.67 - 43.33 26.67 - 20.00 26.67 - 30.00 46.67 30.00 20.00 23.33 33.33 10.00 10.00 - - 13.00 -

Table C2: Computed geometric roughness parameters using Pinboard observations from SMAPVEX16 field campaign. Taken from (Walker et al., 2023).
Field Name DOY 𝜎𝑠, mm 𝑙𝑐, mm In-row Across-row In-row Across-row
JPL flux LTAR south NA01 NA04 S02 S03 S04 S09 S10 155 146 157 149 155 153 150 149 150 147 157 151 147 157 151 S19 147 S21 154 S32 155 S36 151 SF02 153 SF07 150 SF10 154 SF15 SF21 flux north 154 SF21 flux south 146 S11 S14
9 ± 1 8 ± 2 8 ± 1 15 ± 1 9 ± 1 14 ± 1 17 ± 2 7 ± 1 11 ± 1 10 ± 2 9 ± 1 8 ± 1 8 ± 1 10 ± 1 9 ± 1 10 ± 1 7 ± 1 12 ± 1 15 ± 3 6 ± 1 7 ± 1 11 ± 1 11 ± 1 10 ± 3 15 ± 1 12 ± 1 8 ± 1 15 ± 1 14 ± 1 16 ± 2 15 ± 1 13 ± 1 18 ± 1 14 ± 2 11 ± 1 13 ± 2 14 ± 1 10 ± 1 13 ± 1 14 ± 1 12 ± 1 17 ± 2 14 ± 1 11 ± 1 21 ± 2 14 ± 1 17 ± 2 13 ± 3 54 ± 5 59 ± 11 60 ± 7 21 ± 6 54 ± 7 30 ± 4 15 ± 6 62 ± 5 43 ± 6 51 ± 9 58 ± 6 64 ± 5 61 ± 7 51 ± 5 52 ± 5 53 ± 8 68 ± 4 38 ± 7 35 ± 7 69 ± 3 65 ± 2 42 ± 5 36 ± 3 23 ± 4 44 ± 7 64 ± 4 18 ± 4 30 ± 5 23 ± 7 21 ± 6 28 ± 6 12 ± 4 30 ± 9 52 ± 6 33 ± 8 30 ± 4 51 ± 8 32 ± 8 27 ± 7 41 ± 7 19 ± 6 34 ± 5 44 ± 5 9 ± 6 30 ± 4 19 ± 6

Figure C1: Relationship between crop height and various SAR and optical features for soybean crops. The features and field conditions are identical to those described in Figure 4.13, with crop height replacing VWC as the measured parameter. Panels (a-g) correspond to the same fields and conditions as in Figure 4.13. These scatter plots illustrate how soybean height correlates with various remote sensing parameters across different field conditions, irrigation status, topographies, and management practices.

Figure C2: Relationship between crop height and SAR and optical features for corn crops. Features include SAR-derived (VH, VV, RVI), optical (Red-edge, NDVI, NDWI), and soil moisture (SM) parameters. Panels (a-h) correspond to the same fields and conditions as in Figure 4.15. These plots demonstrate the complex relationships between crop height and SAR and optical features across various field conditions and irrigation status.

Figure C3: Performance metrics for corn VWC estimation using different feature combinations and model types.
Figure C4: Performance metrics for soybean height estimation using different feature combinations and model types.

Figure C5: Performance metrics for corn height estimation using different feature combinations and model types.

[Figures C3 to C5: each figure tabulates MAE, RMSE, and R2 for every feature combination (rows) under four models (columns): STL-FM, MTL-FM, STL-RF, and STL-XGBoost. Feature combinations draw on VH, VV, VH/VV, RVI, H, E, alpha, Red-edge, NDVI, NDWI, P, tmin, and tmax.]

Figure C6: Comparison of estimated and actual soybean VWC and height for fields selected from Iowa 2016 test dataset.

Figure C7: Comparison of estimated and actual soybean VWC and height for fields selected from Iowa 2016 test dataset.

CONCLUSION

This dissertation demonstrates the potential of advanced remote sensing technologies, particularly SAR, combined with deep learning models to enhance agricultural monitoring and yield prediction. By integrating multi-temporal SAR data, time-series analysis, machine learning, deep learning, and geospatial foundation models, this work substantially improves the accuracy and reliability of crop attribute estimation, including planting date, yield, VWC, and crop height, across diverse climatic regions and management practices.

Key findings reveal that incorporating SAR-derived planting dates can significantly refine yield predictions, reducing biases and uncertainties in rainfed paddy fields, as evidenced in the Cambodia case study. Using SAR-derived planting dates improved yield prediction accuracy by 7–48% across different provinces and significantly reduced the normalized bias for rice yield. Crop-calendar-based and SAR-derived planting dates differed by as much as 75 days, which underlines the importance of accurate planting date estimation.

Moreover, machine learning and deep learning techniques, particularly XGBoost and patch-based 3D-CNNs, proved highly effective in predicting yields with small error margins. Specifically, the models achieved a 7.5% margin of error in predicting yields a full month before harvest, underscoring the critical role of SAR data, especially the VH channel, in capturing essential crop features.
The analysis further showed that XGBoost consistently outperformed other methods, particularly when reference data were limited, while patch-based 3D-CNNs closely approximated XGBoost’s performance with a more streamlined set of input features.

Furthermore, this research pioneers the application of self-supervised learning within geospatial foundation models to estimate VWC and crop height. The Single-Task Learning Foundation Model (STL-FM) demonstrated superior accuracy and generalization capabilities, achieving R² values of 0.90 and 0.89 for soybean and corn VWC, respectively, and 0.95 and 0.98 for soybean and corn height, respectively. Integrating SAR, optical indices, and weather data provided more reliable estimates than any individual data source alone. Feature importance analysis identified NDVI, NDWI, VH backscatter, and precipitation as key drivers of accurate VWC and height estimation, with the red-edge band emerging as particularly significant for VWC estimation.

In conclusion, this dissertation advances the field of agricultural remote sensing by showcasing the powerful synergy between SAR data and machine learning and deep learning models. The research not only improves crop monitoring techniques but also sets a precedent for future innovations in agricultural management. These advancements contribute to a more precise, data-driven approach to support global food security and sustainable agriculture across diverse landscapes.

5. FUTURE RESEARCH DIRECTIONS

Our study opens several avenues for future research in crop monitoring using geospatial foundation models (FMs). Future work should explore larger areas for self-supervised learning (SSL), covering different climates, management practices, and crop types. In the current study, due to high computational costs, we utilized an area surrounding the reference data in Iowa and Michigan for SSL (Figure 4.1).
This approach, while effective, did not encompass varied field conditions or diverse climates such as that of Florida. Future research with expanded resources could broaden the scope to include a wider range of agricultural conditions and climatic zones, potentially enhancing the model's generalizability.

While our current models primarily use C-band Sentinel-1A data and achieve acceptable accuracy for VWC and crop height estimation without incorporating surface roughness and soil moisture, the limited L-band data from Florida show promise for improved canopy-level sensitivity. As illustrated in Figure 4.14i, there are notable similarities between L- and C-band VV polarizations, as well as between C-band VH and L-band cross-polarization backscatter. Future research should explore integrating Sentinel-1A C-band SAR data and L-band SAR data from the upcoming NASA-ISRO Synthetic Aperture Radar (NISAR) mission (scheduled for launch in February 2025) with FMs to enhance crop monitoring capabilities. The selection of C-band or L-band should be informed by crop biomass characteristics, with L-band potentially offering advantages for high-biomass crops and C-band being more suitable for lower-biomass crops. This integrated approach could provide a more comprehensive and nuanced understanding of crop dynamics across various growth stages and biomass levels.

The enhanced accuracy in VWC and crop height estimation offered by FMs has significant implications for precision agriculture. Future research should focus on translating these improvements into practical tools for farmers and agricultural managers, supporting more efficient resource use and potentially higher yields.
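
For readers who wish to benchmark future models against the accuracies reported in this dissertation, the three metrics used throughout (MAE, RMSE, and R²) can be computed as in the minimal Python sketch below. It is an illustration only, not the evaluation code used in this work. It assumes that R² denotes the coefficient of determination (a squared correlation coefficient would give different values), and the function name `regression_metrics` and the sample crop heights are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R2) for paired observed and predicted values.

    R2 is computed as the coefficient of determination, 1 - SS_res / SS_tot.
    This definition is an assumption made for this sketch.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R2 is undefined when all observations are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, rmse, r2


# Hypothetical crop heights in cm (illustrative values, not measurements
# from this study).
observed_cm = [50.0, 100.0, 150.0, 200.0]
predicted_cm = [55.0, 95.0, 160.0, 190.0]

mae, rmse, r2 = regression_metrics(observed_cm, predicted_cm)
print(f"MAE={mae:.2f} cm, RMSE={rmse:.2f} cm, R2={r2:.2f}")
```

MAE and RMSE carry the units of the estimated variable, so values for VWC and for crop height are not directly comparable, whereas R² is unitless.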