Introduction
In the twentieth century, tomato (Solanum lycopersicum L.) became the second most produced vegetable crop after potato (FAOSTAT 2023). Tomato belongs to the Solanaceae family and is generally considered a vegetable, although it is sometimes referred to as a fruit (Foolad, 2007). Tomato originated in South America and was probably domesticated in Mexico. It was brought to Europe by the Spanish in the 1540s and grew particularly well in Mediterranean climates. Tomato was consumed from the early 17th century in Spain (López-Terrada, 2014). At first, tomato was treated in Europe as an ‘ornamental’ and was known as ‘golden apple’, ‘love apple’ or ‘Peruvian apple’ (Jones, 2007). Tomato was introduced to Greece in the eighteenth century. Currently, Greece is ranked sixth in total tomato production in the European Union, after Spain, Italy, Portugal, Poland and the Netherlands, with 853.2 thousand tonnes (European Commission - DG Agri G2, 2021). Crete is the leading region of Greece for greenhouse production, followed by Peloponnese, Macedonia, Thessaly, Central Greece, Epirus and the Aegean Islands (Savvas et al., 2016). To meet the increasing demand for tomatoes, breeding is needed to enhance yield and improve tolerance to various stresses.
Breeding efficiency in tomato has been improved by using molecular markers to identify and transfer important alleles from germplasm to elite cultivars (Foolad, 2007). However, polymorphic markers that distinguish closely related tomato species and cultivars are scarce, because the majority of molecular markers were developed based on polymorphisms between domesticated tomato and its wild relatives (Tanksley et al., 1992; Fulton et al., 1997; Frary et al., 2005; Sadiyah et al., 2020). Simple sequence repeat (SSR) markers are often the preferred molecular markers for marker-assisted plant breeding when they are available, because they possess properties suitable for high-throughput genotyping, such as high reproducibility, co-dominant inheritance, multi-allelic variation, a simple assay, low cost and easy automation (Edwards and McCouch, 2007; Vieira et al., 2016; Bhattarai et al., 2021).
In recent years the majority of available tomato seeds have been hybrids, which are costly and sometimes lack qualities such as flavour and aroma. Hybrid seed also requires growers to buy new seed every year to achieve consistent production (Mavromatis et al., 2013). In contrast to modern-day hybrids, tomato landraces are very heterogeneous because they have been selected for their performance in adverse and low-input agricultural environments, as well as for qualitative criteria, e.g. aroma and flavour. Tomato is mainly propagated by seed, which enables the genetic material to be saved through the years. Many efforts, individual or organized, have been made to collect seeds from different vegetable crops, including tomato. These seeds are either inferred varieties, landraces or populations of vegetables, and they are collected, preserved and cultivated organically by farmers.
Databases are a valuable tool in the cataloguing and analysis of biological specimens. Biological databases provide a deep knowledge store for scientists to preserve genomic and phenotypic data (Whitehornand and Marklyn, 2001). Databases also improve understanding of the relationships between available data and offer new insights into the relationships between traits, supporting better decision-making and the identification of new opportunities (Codd, 1971). Databases of this type exist for other plants. For instance, Vitis vinifera L., a plant that has been widely investigated, has multiple available databases: a European Vitis Database (http://www.eu-vitis.de/index.php), an Italian (https://vitisdb.it), a Swiss (http://www1.unine.ch/svmd), a French (http://plantgrape.plantnet-project.org/it/cepages) and a Greek Vitis Database (http://www.biology.uoc.gr/gvd). Databases also exist for spinach (http://spinachbase.org), olive (http://www.bioinfo-cbs.org/ogdd/) and rice (http://server.malab.cn/Ricyer/index.html). Existing tomato databases include the Tomato Functional Genomics Database (TFGD) (http://ted.bti.cornell.edu/) and the Kazusa Tomato Genomics Database (https://www.kazusa.or.jp/tomato/).
Even though tomato is not native to Greece, a significant effort has been devoted to documenting and understanding the wide range of phenotypic and genetic variability in Greek tomato cultivars using molecular markers (Terzopoulos and Bebeli, 2008; Gonias et al., 2019; Athinodorou et al., 2021). However, even though Greece is considered an important producer, the data from previous studies are not widely available. The objective of this study was to perform molecular fingerprinting with SSR markers of 27 landraces and two hybrids cultivated in Crete, Greece. A population structure analysis was performed to delve deeper into the ancestry of these ecotypes, under the hypothesis that they represent diverse and previously uncharacterized genetic material. A publicly available relational database was created, containing all available information about tomato genotypes cultivated in Greece, including the Greek and ISO-transliterated name, place of origin/location and genetic/molecular characteristics.
Materials and methods
Plant material
In 2019 and 2020, in collaboration with a nonprofit cooperative enterprise named ‘Melitakes’, seeds from 27 landraces of tomato (S. lycopersicum L.), a total of 83 individuals, were collected for this study (Table 1). ‘Melitakes’ is a Social Cooperative Enterprise located in Pyrgos, Heraklion, Crete, Greece, which collects seeds from different vegetable crops, including tomato. In addition, in collaboration with the Department of Agriculture of the Hellenic Mediterranean University of Crete, seeds from six tomato landraces and two hybrids cultivated in Crete were collected. The landrace names in Greek, the two hybrid names, the code names used in the analysis and the origin of the seeds are shown in Table 1.
In italics is the origin of the seeds for 83 individuals of the 27 tomato landraces and the two tomato hybrids.
Fourteen landraces were grown in the greenhouse of the Department of Biology, University of Crete, and the remaining thirteen landraces and two hybrids were planted in a greenhouse on the farm of the Department of Agriculture of the Hellenic Mediterranean University of Crete. In both cases, seeds were germinated in a controlled-environment nursery, in trays filled with peat moss and perlite. When the plants reached the fourth main-stem leaf stage, they were transplanted into 2 L pots filled with peat moss and perlite in a ratio of 3:1. The plants were watered every three days, and a well-balanced fertilizer was used (Nitrophoska® 15-5-20 (+ 2MgO + 8S + TE), EuroChem, Greece). Young leaves were collected from all genotypes and were either stored at −80 °C or used directly for DNA extraction (Table 1).
DNA extraction
DNA was extracted using a CTAB-based protocol according to Gonias et al. (2019) and Bibi et al. (2020). Leaf tissue (0.05 g) was ground in liquid nitrogen with a mortar and pestle. A CTAB buffer was prepared with 20.5 g NaCl (Sigma Aldrich by Merck KGaA, Darmstadt, Germany), 5 g CTAB (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) in 215 mL ddH2O, 25 mL 1 M Tris-HCl pH 8 (Sigma Aldrich by Merck KGaA, Darmstadt, Germany), 10 mL 0.5 M EDTA pH 8, 1 μl β-mercaptoethanol (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) and 1% (w/v) PVP 360 (Sigma Aldrich by Merck KGaA, Darmstadt, Germany). CTAB buffer (0.5 mL) was added to the 50 mg (0.05 g) of ground tissue and incubated at 65 °C for 30 min with occasional vigorous shaking. Chloroform:isoamyl alcohol (24:1) (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) (0.5 mL) was added to each sample and mixed by vortexing. The samples were centrifuged at 13,000 rpm for 15 min to resolve the phases. The aqueous phase was carefully pipetted into a fresh 1.5 mL tube, 0.5 mL cold isopropanol was added, and the samples were mixed and incubated at −20 °C, at 4 °C or on ice for 1 h or overnight. The samples were centrifuged at 9000 rpm for 5 min. The supernatant was discarded and the pellet was washed in 70% ethanol: 1 mL of 70% ethanol was added, the samples were centrifuged at 13,000 rpm for 5 min, the ethanol was carefully removed and the pellet was dried for at least an hour. The precipitate was dissolved in 100 μl TE buffer by gentle inversion for at least an hour. RNase A (Qiagen, by Safe Blood Bio Analytical, Greece) (10 mg/mL; 1 μl) was added and the sample was incubated at 37 °C for 30 min (at least 15 min). Proteinase K (F. Hoffmann-La Roche Ltd) (1 mg/mL; 1 μl) was added and the samples were again incubated at 37 °C for 15–30 min.
Sodium acetate (3 M; 10 μl) (Sigma Aldrich by Merck KGaA, Darmstadt, Germany) and 250 μl absolute ethanol were added. The samples were incubated at −20 °C, at 4 °C or on ice for 1 h or overnight. The samples were centrifuged at 14,000 rpm for 10 min and the supernatant was discarded. Then 1000 μl of 70% ethanol was added, the samples were centrifuged at 13,000 rpm for 5 min, the ethanol was carefully removed and the pellet was left to dry for at least an hour. In the final step, the DNA pellets were suspended in 100 μl 1X Tris-EDTA (TE) buffer (Serva by TechnoBioChem Ltd, Greece) and stored at −20 °C. DNA quality was checked on 0.8% agarose (Invitrogen™ by Thermo Fisher Scientific) gels stained with ethidium bromide (Sigma Aldrich) (10 mg/mL), and the samples were then diluted in TE to a final concentration of 10 ng/μl.
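As a worked illustration of the final dilution to 10 ng/μl, the volumes of stock DNA and TE follow from C1·V1 = C2·V2. The Python sketch below makes that arithmetic explicit. The stock concentration and the 50 μl final volume are hypothetical values chosen for the example and are not figures reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=10.0, final_volume_ul=50.0):
    """Return (stock_ul, te_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. The default final volume is an assumption
    made for illustration only.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    te_ul = final_volume_ul - stock_ul
    return stock_ul, te_ul


# Hypothetical stock of 250 ng/ul diluted to 10 ng/ul in 50 ul
stock_ul, te_ul = dilution_volumes(250.0)
print(f"{stock_ul:.1f} ul DNA + {te_ul:.1f} ul TE")
```

A stock that is already below the target concentration cannot be diluted to it, so the function raises an error instead of returning a negative TE volume.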
SSR analysis
A set of 11 SSR markers scattered throughout the genome of S. lycopersicum L. and providing a high number of alleles was selected from He et al. (2003) and Korir et al. (2014) (Table 2), and all samples were genotyped with these markers. PCR was conducted in a final volume of 10 μl and the reaction mixture contained 1 ng/μl genomic DNA, 0.2 μM of the forward primer (labelled) and 0.2 mM of the reverse primer (unlabelled), 0.2 mM dNTP, 1U Taq DNA Polymerase (New England Biolabs, Ipswich, MA, USA), 1X Taq polymerase Mg-free buffer, and 2 mM MgCl2. The forward primers were labelled with ABI fluorescent dyes HEX (green), ROX (red), and FAM (blue) (Eurofins Genomics). Amplifications were performed using a T100 thermal cycler (Bio-Rad Laboratories Inc., United Kingdom). The amplification conditions consisted of an initial denaturation step of 5 min at 95 °C, followed by 30 cycles of 30 s at 94 °C, 30 s at 50–62 °C and 30 s at 72 °C, with a final extension at 72 °C for 10 min. The resulting PCR products were first visualized by 0.8% agarose gel electrophoresis. Before SSR fractionation, up to three different primer pairs were mixed in the same well (multiplex), taking into account the size of the amplified fragments and/or the labelling of the primers. The products were loaded into the SeqStudio genetic analyser (Applied Biosystems, Foster City, California, USA) for SSR fractionation. During fragment analysis, the LIZ600 size standard (Applied Biosystems) was employed. Allele binning and data matrix production were done in STRand, version 2.4.108 (Veterinary Genetics Lab, University of California).
Genetic analysis and neighbour-joining tree construction
For each locus, allele sizing was based on published repeat patterns (Carvalho et al., 2021). Data matrices were produced and genetic diversity measures were determined for each locus across all fingerprinted genotypes. These measures included: (i) individual locus polymorphic information content (PIC) (Botstein et al., 1980), (ii) observed heterozygosity (HO) and (iii) expected heterozygosity (HE). PIC, HO, HE, estimated frequency of null alleles and probability of identity (PI) were calculated with the CERVUS ver. 3.0.3 software package (Kalinowski et al., 2007). A similarity matrix was produced from Nei's distance matrix within GenAlEx version 6 (Peakall and Smouse, 2012). Subsequently, a neighbour-joining tree was produced using the function aboot of the poppr package of R (v. 4.1.3) to estimate the dendrogram based on Nei's genetic distance, together with bootstrap values on the branches of the tree. The 27 tomato landraces and two hybrids represented by the 83 tomato genotypes were used for dendrogram construction. To estimate the divergence between the different populations, pairwise Fst measurements were calculated according to Weir and Cockerham (1984) using GenAlEx 6 (Peakall and Smouse, 2006). Analysis of molecular variance (AMOVA) was also performed to assess the genetic structure of the 27 tomato landraces and two tomato hybrids, using GenAlEx 6.
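To make the per-locus diversity measures concrete, the following Python sketch computes allele frequencies, HO, HE, PIC and PI from diploid SSR genotypes using the standard textbook formulas: HE = 1 − Σp², PIC = 1 − Σp² − Σ 2p_i²p_j² (Botstein et al.), and PI = Σp⁴ + Σ(2p_i p_j)² for unrelated individuals. It is an illustration only. CERVUS may apply sample-size corrections that are omitted here, and the genotypes in the example are hypothetical allele sizes, not data from this study.

```python
from collections import Counter
from itertools import combinations
from math import prod


def allele_frequencies(genotypes):
    """Allele frequencies from a list of diploid genotypes (allele1, allele2)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(freqs):
    """HE = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    cross = sum(2 * x * x * y * y for x, y in combinations(p, 2))
    return 1.0 - sum(x * x for x in p) - cross


def probability_of_identity(freqs):
    """PI = sum(p_i^4) + sum over i<j of (2 * p_i * p_j)^2."""
    p = list(freqs.values())
    return sum(x ** 4 for x in p) + sum((2 * x * y) ** 2 for x, y in combinations(p, 2))


def cumulative_pi(per_locus_pi):
    """Combined PI over independent loci is the product of per-locus values."""
    return prod(per_locus_pi)


# Hypothetical genotypes at one locus (allele sizes in bp)
locus = [(100, 100), (100, 104), (104, 104), (100, 104)]
freqs = allele_frequencies(locus)
print(observed_heterozygosity(locus), expected_heterozygosity(freqs),
      pic(freqs), probability_of_identity(freqs))
```

Because the combined PI is a product over loci, adding informative loci lowers it quickly, which is why a panel of 11 multi-allelic markers can reach very small values.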
Population structure
The genetic structure of these individuals was analysed using STRUCTURE 2.3.4 software (Pritchard et al., 2000). This software applies a Bayesian clustering algorithm to identify subpopulations, assign individuals to them, and estimate population allele frequencies (Pritchard et al., 2000). The analysis was carried out using a burn-in period of 200,000 iterations and a run length of 800,000 MCMC replications. We tested a continuous series of K, from 1 to 10, in 10 independent runs. We did not introduce any prior knowledge about the origin of the population and assumed correlated allele frequencies and admixture (Falush et al., 2003). To select the optimal value of K, ΔK values (Evanno et al., 2005) were calculated using STRUCTURE HARVESTER (Earl and vonHoldt, 2012). POPHELPER (Francis, 2017) was used to analyse and visualize population structure.
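The ΔK statistic used to choose K can be sketched in a few lines of Python. The version below follows the usual description of the Evanno method: for each K with neighbours K − 1 and K + 1, ΔK = |L(K + 1) − 2L(K) + L(K − 1)| / sd(L(K)), with L(K) the mean log probability over replicate runs. Whether STRUCTURE HARVESTER averages in exactly this order is an assumption, and the log-likelihood values are made up for illustration.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate log-likelihoods.

    lnp_by_k maps K -> list of Ln P(D) values, one per independent run.
    Returns {K: delta K} for every K that has both neighbours and a
    non-zero standard deviation across runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference needs both neighbours
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # delta K is undefined when all runs agree exactly
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical Ln P(D) values for K = 1..4, three runs each
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-780.0, -785.0, -775.0],
    4: [-770.0, -760.0, -780.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Note that ΔK cannot be computed for the smallest or largest K tested, so a series from 1 to 10 yields values only for K = 2 to 9.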
Multidimensional scaling analysis
Multidimensional scaling (MDS) is a computational approach used to visualize the level of similarity (or dissimilarity) between high-dimensional individuals as a configuration of points mapped into a Cartesian space (Mead, 1992). MDS is a distance-based method. Here, we applied the Rs distance (Reynolds et al., 1983) between the populations of the sample. Reynolds distance (or coancestry distance) provides an estimate of the genetic drift between populations. MDS and the Reynolds distance were calculated using the R programming language (v. 4.1.3) and the packages poppr and adegenet. Since distances were calculated between populations (and not between individuals), we used the function genind2genpop from the adegenet R package to convert individual genotype data into allele counts per population.
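The population-level distance can be illustrated with a short Python sketch. It starts from allele counts per population and locus (the kind of table genind2genpop produces) and computes one common formulation of the Reynolds coancestry distance: the square root of Σ_l Σ_a (p_1la − p_2la)² divided by 2 Σ_l (1 − Σ_a p_1la p_2la). Whether the R function used by the authors applies exactly this form is an assumption, and the counts and locus names below are hypothetical. The resulting pairwise matrix is what an MDS routine would take as input.

```python
from math import sqrt


def to_frequencies(counts_by_locus):
    """Convert {locus: {allele: count}} into {locus: {allele: frequency}}."""
    freqs = {}
    for locus, counts in counts_by_locus.items():
        total = sum(counts.values())
        freqs[locus] = {allele: n / total for allele, n in counts.items()}
    return freqs


def reynolds_distance(pop1_counts, pop2_counts):
    """Reynolds-type coancestry distance between two populations.

    Both arguments map locus -> {allele: count}. Alleles absent from a
    population are treated as frequency 0.
    """
    f1, f2 = to_frequencies(pop1_counts), to_frequencies(pop2_counts)
    numerator = 0.0
    denominator = 0.0
    for locus in f1:
        alleles = set(f1[locus]) | set(f2[locus])
        shared = 0.0
        for allele in alleles:
            p = f1[locus].get(allele, 0.0)
            q = f2[locus].get(allele, 0.0)
            numerator += (p - q) ** 2
            shared += p * q
        denominator += 1.0 - shared
    return sqrt(numerator / (2.0 * denominator))


# Hypothetical allele counts at a single locus
pop_a = {"locus1": {"100": 10}}
pop_b = {"locus1": {"100": 5, "104": 5}}
print(reynolds_distance(pop_a, pop_b))
```

Under this formulation, two populations that share no alleles at any locus are at distance 1, and populations with identical frequencies are at distance 0.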
The Greek Tomato Database
The data were converted to CSV files and imported into the database via phpMyAdmin (www.phpmyadmin.net, version 5.1.0), which was configured and used as the main tool for data management. The platform was deployed on a new server and the content is served via an Apache web server (httpd.apache.org, version 2.4.41) running Ubuntu version 20.04 LTS as its operating system. The web hosting server also runs Ubuntu Server (ubuntu.com, version 20.04.01 LTS). The website was developed with the Laravel PHP framework, version 7.30.4, and its content is served with the web scripting language PHP, version 7.4.3. The database is stored on MySQL server version 8.0.28.
The database is served by an Apache web server (httpd.apache.org, version 2.4.41). All development work was done on a PC running Ubuntu (ubuntu.com, version 20.04 LTS), and the database is supported by the open-source MySQL relational database. The server is located in the facilities of FORTH-IMBB in a specially configured data centre. The database can be accessed at http://139.91.75.96/tomatodb using a standard web browser. Users have access to information about the samples, such as sample collection sites, and can store SSR results and information about SSR primers. In addition, there are photos for each sample showing the plant, leaf, foliage and fruit, as well as a cross-section of the fruit.
Our two main tasks were, first, to create the SQL structure of each database table based on the fields found in the header of the flat text file and, second, to import the data from the old format to the new one; each flat file was therefore converted into a separate table within the same database. The conversion from the flat-file format to the new database proceeded as follows. Based on the header fields and the kind of data each one holds, we chose a data type for each field. We then wrote an SQL statement for each table, and these statements were run in phpMyAdmin to create the tables.
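The flat-file-to-table workflow described above can be sketched in Python. The example reads a CSV header, infers a column type from the values, builds a CREATE TABLE statement and imports the rows. It is a minimal illustration under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for MySQL and phpMyAdmin, the table name, column names and rows are hypothetical, and the type inference is far simpler than a hand-designed schema.

```python
import csv
import io
import sqlite3


def infer_type(values):
    """Pick INTEGER, REAL or TEXT depending on what every non-empty value parses as."""
    filled = [v for v in values if v.strip()]
    for cast, sql_type in ((int, "INTEGER"), (float, "REAL")):
        try:
            for v in filled:
                cast(v)
        except ValueError:
            continue  # at least one value does not parse; try the next type
        if filled:
            return sql_type
    return "TEXT"


def quote_ident(name):
    """Quote an SQL identifier taken from the CSV header."""
    return '"' + name.replace('"', '""') + '"'


def import_csv(conn, table, csv_text):
    """Create `table` from the CSV header and insert all data rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    types = [infer_type([row[i] for row in data]) for i in range(len(header))]
    columns = ", ".join(f"{quote_ident(h)} {t}" for h, t in zip(header, types))
    conn.execute(f"CREATE TABLE {quote_ident(table)} ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})", data
    )
    return dict(zip(header, types))


# Hypothetical flat file with one SSR allele call per row
flat_file = "code,locus,allele_size\nsampleA,locus1,146\nsampleB,locus1,150\n"
conn = sqlite3.connect(":memory:")
schema = import_csv(conn, "ssr_results", flat_file)
print(schema)
```

Keeping one table per flat file mirrors the conversion described in the text; relations between tables would be added afterwards through shared key columns.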
Results
Genetic diversity analysis of indigenous landraces of tomato
A total of 83 individuals from the 27 tomato landraces and two hybrids were genotyped using 11 SSR loci. All 11 SSR loci amplified efficiently and reproducibly. The genetic diversity measures determined for each locus across all fingerprinted genotypes are shown in Table 3. The number of amplified alleles (k) per SSR primer pair varied from eight for SLR5 to fourteen for SLR10, with an average of 10.727 alleles per locus. The expected heterozygosity (He) ranged from 0.440 in SLR3 to 0.813 in SLR26, with an average value of 0.6172. High heterozygosity indicates high genetic variability, whereas low heterozygosity indicates little genetic variability (Mc Donald, 2018). The level of polymorphism was assessed by calculating PIC values for each of the 11 SSR loci; the average PIC (polymorphic information content) was 0.5687. Using the 11 SSR markers in combination, the cumulative probability of identity, a measure of the probability of obtaining an identical genotype, was 8.41 × 10⁻⁹. The use of SSR markers in tomato genotypes has similarly yielded a very low cumulative probability of identity in other studies (Laucou et al., 2011; Emanuelli et al., 2013; Doulati-Baneh et al., 2015). This value corresponds to a statistical potential to distinguish a large number of unrelated tomato genotypes. AMOVA was conducted to determine the variation explained by populations. The results indicated that 58% of the genetic variation (P < 0.0001) resided among populations and 42% (P < 0.0001) resided within populations.
Number of observed alleles (k), expected (He) and observed (Ho) heterozygosity, polymorphic information content (PIC) and probability of null alleles (F(Null)).
STRUCTURE analysis
The genetic structure of the whole population was evaluated using STRUCTURE software. The analysis provided evidence for significant population structure in this set of cultivars. A maximum value of the rate of change in the log probability of the data was revealed at K = 9 using Evanno's method (Fig. 1a); that is, the highest ΔK value was observed at K = 9 (Fig. 1d). The estimated logarithm of the probability of the data [L(K)] increased linearly from K = 3 up to K = 7, showing a clear point of inflection (Fig. 1c).
The estimated population structure inferred from the analysis identifies nine genetic groups (ancestor populations, a.p., A, B, C, D, E, F, G, H and I) and is graphically presented in Fig. 1a, supporting the hypothesis that the genetic material is diverse. All individuals were assigned to the nine ancestor populations, revealing interesting pairings that are analysed below along with the dendrogram produced (Fig. 1b).
Genetic distance analysis
A neighbour-joining tree was built based on Nei's distance matrix (Fig. 2), in which shorter branches between two landraces/hybrids indicate higher genetic similarity between them. Based on the phylogenetic tree, three main clusters were revealed, each including individuals from a variety of ancestor populations (Fig. 2a). No previous records have been reported for this group of Greek landraces. The first cluster (Fig. 2a, cluster A, black branches) includes the landraces VEG11HMU600 and Veg10K10C in a smaller cluster; Santorini K40-19; PrunenoirK38A and Bournalati K36; KerasMavI19 K37E and KerasMavFORTH K37E clustered together; Veg15 Belladona F1, Veg16 Bobcat F1 and Veg3 HMU3 clustered together; Veg1 HMU220, Bournelati K10 and Bournalati K34; and finally Veg 7 K21, Veg 12 HMU1120, Veg6 K2A and Veg2 HMU2040 clustered together. According to the dendrogram, Veg10 K10C and VEG11HMU600 cluster together with a bootstrap value of 57 (Fig. 2a). The Structure analysis revealed that both of them have individuals that belong to ancestor populations A and F (yellow and dark blue) (Fig. 1b). The similarity of the genotypes is also depicted in the multidimensional scaling (MDS) diagram (Fig. 2b). Santorini K40-19 has a very distinct position on the tree, and the Structure analysis revealed that this tomato landrace forms an ancestor population of its own (ancestor population D: raff-blue). In the MDS plot Santorini K40-19 also occupies a distinct position (Fig. 2b). Among the landraces in this main cluster, PrunenoirK38A and Bournelati K36 cluster together with a low bootstrap value of 30 (Fig. 2a). The Structure analysis revealed that Bournelati K36 belongs to ancestor population A (dark blue) and PrunenoirK38A has individuals that belong to ancestor populations A and F (yellow and dark blue) (Fig. 1b). The MDS diagram places all landraces that belong to ancestor populations A and F (yellow and dark blue) close together (Fig. 2b).
Furthermore, KerasMavI19 K37E and KerasMavFORTH K37E cluster together with a high bootstrap value of 96, increasing the certainty of this cluster. The Structure analysis also shows that these landraces belong to the same ancestor population I (red, Fig. 1b), and in the MDS diagram they are very close to each other (Fig. 2b). Interestingly, the hybrid Veg16 Bobcat F1 and Veg3 HMU3 cluster together (bootstrap value 62) and are close to Veg15 Belladona F1 (bootstrap value 55). The Structure analysis also shows that Veg16BobcatF1 and Veg3HMU3 belong to the same ancestor population C (light blue, Fig. 1b), and in the MDS diagram they are very close to each other (Fig. 2b). Veg15 Belladona F1 has individuals that belong to ancestor populations C, E and F (light blue, yellow and green) (Fig. 1b). Bournalati K10 and Bournelati K34 are clustered together, with Veg1 HMU220 close by. Bournalati K10 and Veg1 HMU220 belong to the same ancestor population G (dark yellow), while Bournelati K34 belongs to ancestor population B (blue) (Fig. 1b). All of them appear very close in the MDS diagram (Fig. 2b). Finally, a cluster is formed by Veg 7 K21, Veg 12 HMU1120, Veg6 K2A and Veg2 HMU2040. Veg6 K2A and Veg2 HMU2040 are closest to each other (bootstrap value 46) and lie close together in the MDS diagram, while the rest are further apart (Fig. 2b). The Structure analysis showed that Veg 7 K21, Veg6 K2A and Veg2 HMU2040 belong to the same ancestor population B (blue), and Veg 12 HMU1120 has individuals that belong to ancestor populations B and G (blue and dark yellow) (Fig. 1b).
In the second main cluster of the dendrogram (Fig. 2, cluster B, red branches), Agrokipiou K301 and Agrokipiou K302 cluster together and appear identical (bootstrap value 94). According to the Structure analysis, these landraces belong to the same ancestor population E (green, Fig. 1b, 2a). Kipourou K10B is close by (bootstrap value 49, ancestor population E). Another cluster is formed by Agrokipio K30 and VoidokardiaB k37A, but they are not genetically linked. The Structure analysis shows that Agrokipio K30 has individuals that belong to ancestor populations E and F (green and yellow) (Fig. 1b, 2a) and VoidokardiaB k37A has individuals that belong to ancestor populations E and G (green and dark yellow) (Fig. 1b, 2a). In the MDS diagram these landraces are not very close, so no conclusion can be drawn for them (Fig. 2b).
Finally, five landraces are found in the third main cluster of the dendrogram (Fig. 2, cluster C, green branches): Bournelati K35, KerasKokFORTH K37D, Veg8 K37Z, Veg9 HMU9 and Veg13 HMU13. Bournelati K35 and KerasKokFORTH K37D cluster together (bootstrap value 46); according to the Structure analysis, Bournelati K35 belongs to ancestor population F (yellow) and KerasKokFORTH K37D has individuals that belong to ancestor populations F and A (yellow and dark blue) (Fig. 1b, 2a). In addition, Veg9 HMU9, Veg8 K37Z and Veg13 HMU13 cluster together, and according to the Structure analysis these landraces, including Veg13 HMU13, belong to the same ancestor population H (orange, Fig. 1b, 2a). In the MDS diagram these landraces are very close to each other (Fig. 2b).
The database schema
A MySQL server available at the Institute of Molecular Biology and Biotechnology was used to create the structure of the tables needed for each type of data and the relations among them (Fig. 3a, b). An open-source web application in PHP and HTML5 was also developed for data presentation. For data management, the open-source MySQL administration platform phpMyAdmin (www.phpmyadmin.net, version 5.1.0) was installed and configured. The schema of the new database is shown in Fig. 3a. The Greek Tomato Database (GTD) enables readers to retrieve data from one or more tables with a single query. GTD also provides an insight into the protocols used for this research and some basic phenotypic information, including the place of origin of each landrace. GTD can be updated and enriched by users with new and supplemental information to existing datasets. An overview of the GTD indicating its primary functions is presented in the Supplementary material.
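Retrieving data from more than one table with a single query relies on an SQL join over a shared key. The Python sketch below shows the idea with an in-memory SQLite database standing in for the MySQL server. The two tables, their columns and all rows are hypothetical and do not reproduce the actual GTD schema shown in Fig. 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical two-table layout: one row per sample, many SSR calls per sample
    CREATE TABLE samples (
        sample_id INTEGER PRIMARY KEY,
        code      TEXT NOT NULL,
        origin    TEXT
    );
    CREATE TABLE ssr_results (
        sample_id INTEGER NOT NULL REFERENCES samples(sample_id),
        locus     TEXT NOT NULL,
        allele1   INTEGER,
        allele2   INTEGER
    );
    INSERT INTO samples VALUES (1, 'sampleA', 'site1'), (2, 'sampleB', 'site2');
    INSERT INTO ssr_results VALUES
        (1, 'locus1', 146, 150),
        (1, 'locus2', 210, 210),
        (2, 'locus1', 150, 154);
    """
)


def genotypes_for_origin(connection, origin):
    """Return (code, locus, allele1, allele2) rows for samples from one origin."""
    query = """
        SELECT s.code, r.locus, r.allele1, r.allele2
        FROM samples AS s
        JOIN ssr_results AS r ON r.sample_id = s.sample_id
        WHERE s.origin = ?
        ORDER BY s.code, r.locus
    """
    return connection.execute(query, (origin,)).fetchall()


print(genotypes_for_origin(conn, "site1"))
```

The origin is passed as a bound parameter instead of being pasted into the SQL string, which keeps user-supplied search terms from altering the query.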
Discussion
It was hypothesized that in Greece, and especially in Crete, the leading region for greenhouse tomato production, there would be a very diverse gene pool and that, even though it has been narrowed by natural selection over the years, the landraces would still remain diverse. Eleven SSR loci were used to amplify polymorphic fragments from these genotypes, and the probability of obtaining an identical genotype was low (8.41 × 10⁻⁹), corresponding to a statistical potential to distinguish a large number of unrelated tomato genotypes. According to the literature, the use of SSR markers in tomato genotypes has given very low cumulative probabilities of identity in many studies (Laucou et al., 2011; Emanuelli et al., 2013; Doulati-Baneh et al., 2015; Gonias et al., 2019). The results supported the hypothesis, revealing that the 27 landraces and the two hybrids belong to nine ancestral populations (ancestry groups A, B, C, D, E, F, G, H and I).
There is an increasing demand to clarify the relationships among local tomato landraces. The phylogenetic tree and the multidimensional scaling (MDS) diagram supported the great genetic diversity within our set of genotypes. Reynolds' approach is considered appropriate for data with small mutation rates under the infinite alleles model. Even though the method was developed for allozymes, which are characterized by a small mutation rate compared to microsatellite data, it is considered appropriate for small populations, in which genetic drift considerably affects the evolution of the populations. In such scenarios (which also apply to our data), SSR mutations do not show the bell-shaped distribution expected under the stepwise mutation model. Instead, the allelic distribution is similar to that of the infinite alleles model (Reynolds et al., 1983).
The Structure analysis, using Evanno's method, indicated that nine ancestral populations underlie the group of genotypes tested. Notably, the majority of the landraces are genetically distant from the two hybrids, except for two landraces (i.e. Veg3 is identical to Veg16Bobcat and Kipourou K10B is similar to Veg15 Belladona F1). These two hybrids (Veg15Belladona F1 and Veg16Bobcat F1) were also included in a genetic analysis by Gonias et al. (2019) and, similarly to our findings, the two hybrids were found close together in the dendrogram and were assigned to the same ancestor population.
These data give an insight into the relationships among the 27 tomato landraces cultivated in Crete, Greece, and allow some speculation about their origin. Structure analysis and the MDS diagram assign the landraces to different ancestor populations, allowing us to delve deeper into their origin. There is no previous research regarding the ancestry of these 27 tomato landraces; however, the results for the two hybrids support previous findings regarding their origin (Gonias et al., 2019).
A publicly available tomato database is needed in Greece. The database developed here is the first Greek Tomato Database that is fully functional and publicly available. Such databases are well known to be valuable tools in the cataloguing and analysis of biological specimens and provide a deep knowledge store for scientists to preserve genomic and phenotypic data (Whitehornand and Marklyn, 2001). This database will serve as a repository for the molecular fingerprints of 27 tomato ecotypes, allowing the characterization, identification and preservation of these tomato ecotypes in Greece. It is also a starting point for incorporating new genetic information to address issues such as identification, sanitation, adaptation etc.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S147926212300103X.
Acknowledgements
All authors would like to acknowledge Stella Hatzigeorgiou from the Hellenic Mediterranean Organization ELGO –DIMITRA Institute of Olive, Subtropical crops and Vitis, and Manoli Vardaki, from ‘Melitakes’, a Social Cooperative Enterprise located in Pyrgos, Heraklion, Crete, Greece. Their significant contribution was to provide us with the seeds from most of the landraces analysed.
Authors' contributions
Study conception and design: ACB, DK. Acquisition of data: ACB, AK, JM, KK. Analysis and interpretation of data: ACB, JM, PP. Drafting of manuscript: ACB, KK, PP. Critical revision: ACB, KK, EG. All authors read and approved the final manuscript.
Funding statement
This work was supported by the Emblematic Action of the Greek General Secretariat for Research and Technology, Agro4Crete (Protocol Number: SAE 013), Emblematic research action for the agri-food sector of Crete: four institutes, four reference points (Agro4Crete). This action is incorporated in Subproject 2, Intervention B, ‘Pilotic application of new innovations in agriculture production’, for the project ‘National Emblematic research action for the utilization of new technologies in the agri-food’, and it will be accomplished via a collaboration of four Institutes: Hellenic Mediterranean University, University of Crete, Foundation of Research and Technology, Institute of Molecular Biology and Biotechnology, and Hellenic Mediterranean Organization ELGO-DIMITRA.
Competing interest
None.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
All authors have approved the submission.
Availability of data and materials
The datasets generated and/or analysed during the current study are available in the Greek Tomato Database, http://139.91.75.96/tomatodb.