Makgeolli is a traditional Korean fermented rice wine with an alcohol content in the range of 2 to 8%, also referred to as “tak-ju” because of its thick texture (Kang et al., 2014). Makgeolli is brewed from rice (mixed with nuruk and yeast) through saccharification and alcohol fermentation (Jung et al., 2014). It is considered to contain healthy ingredients, such as proteins, sugars, vitamins, essential amino acids, glutamic acid, proline, glutathione, and others (Kim et al., 2007; Park and Lee, 2002).
The quality of makgeolli is determined mainly by the general quality characteristics of wines, such as alcohol content, pH, total acid, volatile acid, and total sugar content, as well as by minor components, such as organic acids, free sugar, and aroma composition. These quality factors can vary greatly depending on the starch-containing raw materials, the fermentation conditions, the storage conditions, and the types of fermentation agents used (e.g., nuruk or yeast). Therefore, the quality characteristics constitute the most popular subject of research related to makgeolli, and many researchers have conducted studies using various materials (Lee et al., 1996a; Lee et al., 1996b; Lee et al., 2013), various amounts of water (Son et al., 2011), nuruk, and yeast (Han et al., 1997a; Han et al., 1997b; Lee et al., 2007; Lee et al., 2010; Park and Lee, 2002), production conditions (Kim et al., 2012; Lee et al., 2009; Yang and Lee, 1996), and storage conditions (Ji and Chung, 2012; Min et al., 2011; Seo et al., 2015; Yang and Lee, 1996).
To determine the quality characteristics of wines, high-performance liquid chromatography (HPLC) is a possible tool of choice. However, it is rarely used in actual production because of its high cost. In addition to HPLC, Fourier transform near-infrared spectroscopy has been applied to monitor the quality of makgeolli (Kim and Cho, 2015), while gas chromatography-mass spectrometry (GC-MS) and the E-Tongue technique have been used to analyze its ingredients (Kang et al., 2014; Seo et al., 2016).
Among the quality characteristics of makgeolli, the alcohol content is the most important factor, since it can affect the preservability and flavor (Kang et al., 2014). Alcohol is produced during the fermentation process by yeast and micro-organisms, and its content increases as the fermentation proceeds under the influence of temperature. According to Baek et al. (2013), the alcohol content was approximately 2% higher for fermentation at 30°C than for that at 20°C. The alcohol concentration is usually measured with an alcoholmeter and adjusted according to the Gay-Lussac table (Lee et al., 2010; Park and Lee, 2002; Son et al., 2011). However, because the equipment needed to measure alcohol concentration is very expensive, a cheaper alternative would be highly beneficial to all wine producers.
The objective of this study is to develop an alcohol concentration model of makgeolli that could be used in a makgeolli production monitoring system. The model was developed using a multilayer perceptron (MLP) written in the Python programming language. Data were collected from a domestic manufacturer over a period of one year. The independent variables were the temperatures of the fermentation tanks and of the room where the tanks were located, as well as the quantity, acidity, and water concentration of the source. The coefficients of determination R2 of the best model with the training and the test sets were 0.94 and 0.93, respectively.
Materials and Methods
Factors influencing quality
One of the main factors influencing the quality of makgeolli is the raw materials used in its production. Makgeolli is produced mainly from rice, but wheat flour, barley, corn, and sweet potato are also used. Depending on the type of raw materials used for the production of makgeolli, the composition of various compounds (e.g., proteins, organic acids, and fatty acids) will be different, and the substrate compatibility of the yeast or micro-organisms also changes. Likewise, the chemical and sensory components (e.g., taste and aroma) of the makgeolli product change according to the types of raw materials used, and this affects the final quality. Many researchers have studied the quality characteristics of makgeolli using various raw materials (Kim et al., 2008; Lee et al., 1996a; Lee et al., 1996b; Y. Lee et al., 2013).
Nuruk, a fermentation source that provides the enzymes necessary to convert starch (the main component in all the raw materials of makgeolli) into sugar, is classified into traditional and modified types. Traditional nuruk is produced in culture with micro-organisms existing in nature, whereas modified nuruk is prepared by inoculating a sterilized starch-containing raw material with a pure culture of a mold, such as Aspergillus kawachii or Aspergillus oryzae. Because traditional nuruk differs according to the composition of the various strains that grow in it, it can be produced in various forms depending on the manufacturing region and method. Conversely, modified nuruk ensures safe fermentation of the seed mash and prevents bacterial contamination, so that a drink of uniform quality is produced. Therefore, depending on the type of nuruk used, the organic acid productivity, alcohol fermentability, and microbial enzyme activity will differ. Thus, the type of nuruk used has a great influence on the quality characteristics of the makgeolli product (Han et al., 1997a; Han et al., 1997b; Park and Lee, 2002).
Depending on the ratio of water added during the makgeolli production, the alcohol concentration and its quality can also be affected. Most makgeolli makers first produce it with high-alcohol content using a proportion of water that is less than twice the weight of dried rice, and they later add water to lower the alcohol level (Son et al., 2011).
The fermentation environment (temperature, pH, etc.) also affects the activities of the saccharification enzyme and yeast, thus changing the alcohol content and the organic acid composition or content, which may ultimately affect the overall quality of the makgeolli product. The most suitable temperature range inside the fermenter is 22 to 28°C. If the fermentation temperature is too high, the yeast will age quickly and will not ferment sufficiently. In contrast, if the fermentation temperature is too low, the yeast activity slows down and fermentation proceeds slowly. If the internal temperature of the fermenter rises above 32°C, the yeast is destroyed and the mash immediately enters the acetic acid fermentation stage, where it becomes sour and is decomposed by spoilage bacteria. In one study, only 7 days of fermentation at 25°C were required to produce 11% alcohol, whereas three weeks were required to produce the same amount of alcohol at 15°C (Kim et al., 2012).
The alcohol content was the highest and equaled 16.2% when nuruk brewing occurred at 30°C in the presence of total nitrogen (TN). At 20°C, TN produced 14.1% alcohol. The alcohol content was therefore approximately 2% higher for fermentation at 30°C than at 20°C (Baek et al., 2013).
In this study, experimental data were obtained from a real manufacturing environment. Woorisool Co. Ltd., one of the biggest makgeolli breweries in Korea, supported the experiments, and its fermentation process was therefore used herein. Each process used 3-4 fermentation tanks with the same source.
The process consisted of two fermentation stages. The first fermentation stage in which nuruk was grown lasted two days and required a small amount of water. The second fermentation stage, which constitutes the main process to make makgeolli, lasted for five days. A higher temperature is maintained for three days, and a lower temperature is maintained for the last two days. The period can be adjusted to change the alcohol concentration of the product. If the alcohol concentration is not adequately high, the temperature of the tank can be kept at increased levels.
Data for learning was obtained during the period of a year. As shown in Figure 1, Woorisool Co. Ltd. has a temperature control system. Each fermentation tank has its own sensor and two bent pipes to control the temperature. The control system records the temperatures of the fermentation tanks and the room where the tanks are located every minute. Also, the control system triggers valves to flow cold water through the bent pipes when a fermentation tank has reached a desired temperature.
Other data were obtained in the laboratory setting. The properties of the raw materials were analyzed before fermentation began. To collect alcohol concentrations, makgeolli samples were collected both in the morning (10 am) and afternoon (3 pm) every day. To determine the alcohol concentration of each sample, the Kjeldahl method was used. The sample was boiled in an Erlenmeyer flask on a Kjeldahl distiller. The distilled sample with pure water was measured by a densitometer.
There were 192 fermentation processes using 84 fermentation tanks during a year, and 1,859 data samples were collected. All samples were used for MLP learning.
MLP is a supervised learning algorithm. MLP, as shown in Figure 2, elicits the answer with a relatively low computational complexity after the initial learning process. Given a set of features and a target, it can learn a nonlinear function approximator for either classification (Chung et al., 2015; Omid et al., 2017) or regression (Bahmani et al., 2018; Hendrawan and Murase, 2011).
In this study, the Scikit-learn MLPRegressor was used to learn an MLP model and to evaluate it. Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems (Pedregosa et al., 2011). The MLPRegressor, which is a class of Scikit-learn, implements an MLP that is trained using back-propagation with no activation function in the output layer, which is equivalent to using the identity function as the output activation function.
The features, targets, algorithms, learning rate, activation function, number of hidden layers, and number of neurons should be chosen before the MLP learning. As shown previously, there are many factors influencing the quality of makgeolli. Although these could constitute the features needed for the MLP learning, some of them can be obtained only through laboratory tests. Since this study aimed to build a practical model, practical obtainability was a criterion to select features. The features chosen for the MLP learning process are listed in Table 1. Because MLP training is sensitive to the scale of its inputs, all features were standardized to zero mean and unit variance with the Scikit-learn StandardScaler.
Table 1. Features for MLP learning
|Feature|Description|
|---|---|
|Time|Elapsed time from the onset of fermentation|
|Titer|Initial titer of raw material|
|Acidity|Initial acidity of raw material|
|Water content|Initial water content of raw material|
|Temperature|Accumulated temperature in fermentation tank|
|Room temperature|Temperature in fermentation room at specific time|
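As a rough illustration of how two of the features in Table 1 might be prepared, the sketch below computes an accumulated tank temperature from per-minute readings and standardizes a feature column to zero mean and unit variance. The text does not define how the temperature was accumulated, so the time integral in degree-hours is an assumption, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def accumulated_temperature(minute_temps_c):
    """Sum per-minute tank temperatures and express the total in degree-hours.

    Assumption: the text does not define "accumulated temperature";
    a plain time integral of the logged temperature is used here.
    """
    return sum(minute_temps_c) / 60.0


def standardize(column):
    """Return z-scores (zero mean, unit variance) for one feature column."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical readings: 120 minutes at a constant 25 degrees C.
acc = accumulated_temperature([25.0] * 120)

# Hypothetical elapsed times (hours) for three samples.
z_time = standardize([0.0, 24.0, 48.0])
```

The population standard deviation is used because that is also what Scikit-learn's StandardScaler applies.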
The MLPRegressor supports various parameters for learning. It supports four activation functions and three solvers. It is possible to assign a number of hidden layers and a number of neurons in a layer. To create a proper model, various parameters were used for MLP learning. The numbers of hidden layers were 1 to 3, and the number of neurons ranged from 7 to 30. All the combinations of the parameters in Table 2 were used for MLP learning.
Table 2. Parameters for MLP learning
|Parameter|Values|
|---|---|
|Activation function|"identity", "logistic", "tanh", "relu"|
|Solver|"lbfgs", "sgd", "adam"|
|Number of hidden layers|1-3|
|Number of neurons per layer|7-30|
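To give a sense of the size of this search, the following sketch enumerates every combination of activation function, solver, and hidden-layer layout. It assumes that the width of each hidden layer was varied independently between 7 and 30, which the text does not state explicitly; the constant and function names are hypothetical.

```python
from itertools import product

ACTIVATIONS = ("identity", "logistic", "tanh", "relu")
SOLVERS = ("lbfgs", "sgd", "adam")
NEURONS = range(7, 31)        # 7 to 30 neurons per hidden layer
LAYER_COUNTS = (1, 2, 3)      # 1 to 3 hidden layers


def configurations():
    """Yield (activation, solver, hidden_layer_sizes) for every combination."""
    for n_layers in LAYER_COUNTS:
        # Assumption: each layer width is chosen independently.
        for sizes in product(NEURONS, repeat=n_layers):
            for activation, solver in product(ACTIVATIONS, SOLVERS):
                yield activation, solver, sizes


n_configs = sum(1 for _ in configurations())
```

Under this assumption there are 24 + 24² + 24³ = 14,424 layer layouts and 12 activation and solver pairs, so `n_configs` is 173,088. If the widths were tied across layers, the grid would be far smaller.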
Each configuration was trained three times, each time with a different random partition of the data. The data were divided into three groups, namely, training, validation, and testing. First, thirty percent of the data was randomly chosen for testing the developed models. Seventy percent of the remaining data was used for training, and the rest was assigned to validation to check for overfitting.
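The partition described above can be sketched as follows. The rounding rule (integer floor) and the use of a seeded shuffle are assumptions, and the function name is hypothetical; only the 30% and 70% proportions and the sample count of 1,859 come from the text.

```python
import random


def three_way_split(samples, seed):
    """Shuffle, hold out 30% for testing, then split the rest 70/30."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    # 30% of all samples go to the test set (rounded down).
    n_test = len(shuffled) * 30 // 100
    test_set, rest = shuffled[:n_test], shuffled[n_test:]

    # 70% of the remainder is used for training, the rest for validation.
    n_train = len(rest) * 70 // 100
    train_set, val_set = rest[:n_train], rest[n_train:]
    return train_set, val_set, test_set


# Row indices stand in for the 1,859 samples reported in the text.
train_set, val_set, test_set = three_way_split(range(1859), seed=0)
sizes = (len(train_set), len(val_set), len(test_set))
```

With floor rounding this gives 911 training, 391 validation, and 557 test samples; a different rounding rule would shift these counts by one.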
The mean-square error (MSE) is an indicator of the performance of the model (Eq. (1)). A smaller MSE indicates better performance, and the back-propagation algorithm operates to minimize it:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{observation}_i - \mathrm{prediction}_i\right)^2 \qquad (1)$$
where n is the number of data points for the training data set. Herein, an observation is a true value, and a prediction is a calculated model value.
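A minimal sketch of the mean-square error as described above, written in plain Python; the function name and the sample concentrations are hypothetical.

```python
def mean_squared_error(observations, predictions):
    """Average of the squared differences between true and modeled values."""
    if len(observations) != len(predictions) or not observations:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observations)
    return sum((o - p) ** 2 for o, p in zip(observations, predictions)) / n


# Hypothetical alcohol concentrations (%): true values and model outputs.
mse = mean_squared_error([5.0, 7.0, 9.0], [5.5, 7.0, 8.0])
```

Here the squared errors are 0.25, 0, and 1, so the result is 1.25 / 3.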
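To evaluate each model, the coefficient of determination R2 was used. It is easy to obtain the R2 value because the score function of the MLPRegressor returns it. The R2 is defined as
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\mathrm{observation}_i - \mathrm{prediction}_i\right)^2}{\sum_{i=1}^{n}\left(\mathrm{observation}_i - \overline{\mathrm{observation}}\right)^2}$$
where an observation is a true value, a prediction is a calculated model value, and $\overline{\mathrm{observation}}$ is the mean of the observations.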
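The coefficient of determination can be computed by hand as in the sketch below, which follows the standard definition that the score function of the MLPRegressor also uses. The function name and sample values are hypothetical.

```python
def r_squared(observations, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observations) / len(observations)
    ss_res = sum((o - p) ** 2 for o, p in zip(observations, predictions))
    ss_tot = sum((o - mean_obs) ** 2 for o in observations)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical alcohol concentrations (%): true values and model outputs.
r2 = r_squared([5.0, 7.0, 9.0], [5.5, 7.0, 8.0])
```

For these values the residual sum of squares is 1.25 and the total sum of squares is 8, giving an R2 of 0.84375.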
The coefficient of determination R2 was used to check overfitting. If a training model is overfitted, the R2 of the training set is higher than the R2 of the test set. Therefore, the best model has a high R2 and elicits a small difference between the R2 values of the training and test sets.
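One way to express this selection rule in code is sketched below. The gap threshold of 0.02 is an assumption chosen for illustration, since the text gives no numeric criterion, and the candidate scores are made up rather than taken from the study.

```python
def select_best(results, max_gap=0.02):
    """Return the candidate with the highest test R2 among those whose
    train/test R2 gap is at most max_gap, or None if none qualifies."""
    acceptable = [
        r for r in results
        if abs(r["train_r2"] - r["test_r2"]) <= max_gap
    ]
    if not acceptable:
        return None
    return max(acceptable, key=lambda r: r["test_r2"])


# Made-up scores for illustration only; not results from the study.
candidates = [
    {"name": "A", "train_r2": 0.98, "test_r2": 0.85},  # large gap: overfitted
    {"name": "B", "train_r2": 0.92, "test_r2": 0.91},
    {"name": "C", "train_r2": 0.88, "test_r2": 0.87},
]
best = select_best(candidates)
```

Candidate A has the highest training score but is rejected because of its gap, so B is selected.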
Results and Discussion
Characteristics of fermentation temperature
The temperature during fermentation is the most important variable in makgeolli production. Figure 3 shows the temperature changes of four tanks as an example. The first fermentation stage started on August 9, 2017, and the second fermentation started two days later. After the onset of fermentation, the temperature increased drastically. The fermentation tanks were controlled so that the temperature did not exceed 28°C for the first three days and 20°C for the two subsequent days.
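The two-phase temperature limit described above could be encoded as in the following sketch. The exact switch at 72 hours and the function names are assumptions; the text gives only the limits of 28°C and 20°C and the durations in days.

```python
def upper_limit_c(hours_since_second_stage):
    """Return the tank temperature ceiling (degrees C) in the second stage."""
    if hours_since_second_stage < 0:
        raise ValueError("second-stage fermentation has not started")
    if hours_since_second_stage < 3 * 24:   # first three days
        return 28.0
    return 20.0                             # subsequent days


def needs_cooling(tank_temp_c, hours_since_second_stage):
    """True when the tank has reached the ceiling and cold water should flow."""
    return tank_temp_c >= upper_limit_c(hours_since_second_stage)
```

For example, `needs_cooling(25.0, 10)` is false during the warm phase, while the same reading at hour 80 would call for cooling.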
As shown in Figure 4, the temperatures of the tanks were controlled differently. Until August 16, the temperatures were controlled to avoid cases where the temperature exceeded 28°C. After that day, the temperatures of tanks 1 and 4 were not lowered because the alcohol concentrations had not reached the expected level. All tanks showed very similar temperature patterns, but the alcohol concentrations of these two tanks were different. This demonstrates that temperature is not the only factor that affects alcohol concentration. Since each fermentation tank was controlled in the same environment, the difference could have come from the source.
Results of MLP model
The final, optimum MLP model used three hidden layers, a back-propagation solver, and the ReLU activation function. The back-propagation algorithm was introduced by Rumelhart et al. (1986). The hidden layers had 11, 23, and 10 neurons, respectively. The coefficients of determination R2 of the best model with the training and test sets were 0.94 and 0.93, respectively. Although there were models for which R2 was higher than 0.94 in the training set, their R2 values were much lower than 0.93 in the test set. The considerable difference between the R2 values of the training and test sets implies that these models were overfitted, and they were thus excluded. Figure 5 shows a comparison of the actual alcohol concentration and the alcohol concentration estimated by the best model. The maximum and minimum errors were 1.82% and -2.12%, respectively, and the total MSE was 0.078%.
Figure 6 shows the alcohol concentration changes of seven tanks in two fermentation processes. The 101st process used three tanks (22, 23, and 24) and the 102nd process used four tanks (50, 51, 52, and 53). The alcohol concentration values were predicted by the MLP model. All lines have a sigmoidal characteristic similar to that of a general response curve. The two fermentation processes are clearly distinguished in the alcohol concentration: the upper process (102nd process) shows higher alcohol concentrations. Since the tanks within each fermentation process are controlled with a similar temperature profile and use the same source, their patterns are expected to be similar.
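A model with this layout could be set up in Scikit-learn roughly as follows. This is a sketch on synthetic stand-in data, not the authors' code: the solver is set to "adam" as an assumption because the text does not name which of the three solvers was selected, and the data, split, and iteration limit are arbitrary.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 300 rows, 6 columns in the order of Table 1.
X = rng.uniform(0.0, 1.0, size=(300, 6))
# Arbitrary smooth target, used only so that the example can be fitted.
y = 10.0 / (1.0 + np.exp(-6.0 * (X[:, 0] - 0.5))) + X[:, 4]

X_train, X_test = X[:210], X[210:]
y_train, y_test = y[:210], y[210:]

model = make_pipeline(
    StandardScaler(),                      # zero mean, unit variance
    MLPRegressor(
        hidden_layer_sizes=(11, 23, 10),   # three hidden layers
        activation="relu",
        solver="adam",                     # assumption: solver not named in text
        max_iter=2000,
        random_state=0,
    ),
)
model.fit(X_train, y_train)

train_r2 = model.score(X_train, y_train)   # score() returns R2
test_r2 = model.score(X_test, y_test)
predictions = model.predict(X_test)
```

Comparing `train_r2` with `test_r2` gives the same overfitting check that was applied to the candidate models; the values obtained on this synthetic data say nothing about the reported results.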
As mentioned above, the alcohol content is the most important characteristic among all the quality characteristics (Kang et al., 2014). Since the developed model can estimate the alcohol concentration with high accuracy using only three factors of the raw materials and the temperatures in the room and fermentation tank, it is possible to utilize it practically. For example, it could predict future anomaly symptoms, or it could possibly suggest a control profile to obtain products with a better quality.
In order to utilize the model for these purposes practically, it needs a three-step process. The process is to obtain data systematically, to learn and improve a model, and to apply the model in the system automatically. The first step is already completed because the manufacturer has a temperature control system and analyzes the raw material before fermentation.
The second step is to relearn a model in accordance with the growth of data. Since a learning process takes time, it is important to decide the proper onset conditions. Computing power could be wasted if a new learning process starts every time a minor increase in data occurs. It is also important to decide on the termination conditions in the case of a new model. Given that a new model could be overfitted, a method for comparing models without human intervention is needed. Therefore, an additional study is necessary to identify these conditions.
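Purely as an illustration of what such onset and acceptance conditions might look like, the sketch below retrains only after the data set has grown by a given fraction and accepts a new model only if it is not overfitted and scores at least as well on the test set. Both thresholds and both function names are hypothetical; the text leaves these conditions to further study.

```python
def should_retrain(n_samples_now, n_samples_at_last_fit, min_growth=0.10):
    """Start a new learning run only after the data set has grown enough."""
    if n_samples_at_last_fit <= 0:
        return True
    growth = (n_samples_now - n_samples_at_last_fit) / n_samples_at_last_fit
    return growth >= min_growth


def accept_new_model(old_test_r2, new_train_r2, new_test_r2, max_gap=0.02):
    """Replace the deployed model only if the new one is not overfitted
    and its test R2 is at least as high as that of the old model."""
    not_overfitted = (new_train_r2 - new_test_r2) <= max_gap
    return not_overfitted and new_test_r2 >= old_test_r2
```

With these placeholder thresholds, growing from 1,859 to 1,900 samples would not trigger a new run, whereas growing to 2,100 samples would.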
The third step is to apply the model, and it leaves several topics for further study. As mentioned above, the model could be used for various purposes. Among these applications, it is quite promising to predict future alcohol concentrations at specific points. Since the developed model estimates the current alcohol concentration, it needs future temperature data to predict a future concentration. The data could be given based on a predetermined strategy to control the temperature of a fermentation tank. Alternatively, it is also possible to set up a new model to predict the future environment. Therefore, how to develop such a model application is a good topic for further studies.
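The idea of supplying future temperatures from a predetermined control strategy can be sketched as follows. The function builds one feature row per planned hour in the column order of Table 1; it assumes an hourly plan, a constant room temperature, and accumulation in degree-hours, none of which is specified in the text, and all names and values are hypothetical.

```python
def future_feature_rows(elapsed_h, accumulated_deg_h, raw_material,
                        planned_temps_c, room_temp_c):
    """Build one feature row per planned hour ahead.

    raw_material is (titer, acidity, water_content); planned_temps_c holds
    the planned tank temperature for each coming hour.
    """
    rows = []
    for planned in planned_temps_c:
        elapsed_h += 1
        accumulated_deg_h += planned   # one hour at the planned temperature
        rows.append([elapsed_h, *raw_material, accumulated_deg_h, room_temp_c])
    return rows


rows = future_feature_rows(
    elapsed_h=48,
    accumulated_deg_h=1200.0,
    raw_material=(1.0, 0.3, 60.0),     # made-up raw-material values
    planned_temps_c=[28.0, 28.0, 20.0],
    room_temp_c=22.0,
)
```

Each row could then be passed to the fitted regressor to obtain the predicted concentration at that future hour.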
In this study, the alcohol concentration during fermentation was investigated. Data from a total of 1,859 samples were collected. An MLP model was developed with three hidden layers, a back-propagation solver, and the ReLU activation function. The MSE of the best MLP model with the total data was 0.078%. The results demonstrate that this model could help predict alcohol concentration. The next step is to utilize the model for optimization of the makgeolli production process. In future research, prediction models for other quality factors, such as pH, total acid, volatile acid, and total sugar content, will also be developed.
Conflict of Interest
The authors have no conflicting financial or other interests.