CORRUGATED BOX DAMAGE PREDICTION USING ARTIFICIAL
        NEURAL NETWORK IMAGE TRAINING
                              By
                         Sarah Holland
                          A THESIS
                         Submitted to
                 Michigan State University
         in partial fulfillment of the requirements
                      for the degree of
              Packaging – Master of Science
                             2023


                                     ABSTRACT
         This thesis proposes a novel packaging evaluation method using
corrugated box images and an Artificial Neural Network (ANN). An Artificial
Neural Network works in a way similar to that of neurons in a human brain: by
making connections between a trained dataset and the new data provided after
training. ANNs have been implemented in industry in various ways, but their use in
packaging evaluation has been limited. This thesis focuses on corrugated box
damage prediction using an ANN. By analyzing images of damaged corrugated boxes
with an Artificial Neural Network, damaged products can be identified, allowing a
decision to be made as to what type of package failure occurred. One of the
benefits to using an Artificial Neural Network to evaluate corrugated box images
is that it allows for the evaluation of package protection in a real distribution
environment as compared to a controlled lab setting. In turn, this reduces the cost
of testing, as the package failure will have been identified with the assistance of
the Artificial Neural Network, rather than full retesting to identify where damage
occurred. This process would also reduce costs associated with the usage of
materials for testing, due to the lower number of test samples required.


          This thesis is dedicated to the people who supported me through my
  education, and those who aren’t around anymore to celebrate with me. To my
family, my friends, my amazing cats, and my incredible professors: thank you for
               all of the support and encouragement along the way.


                            ACKNOWLEDGEMENTS
        I would like to acknowledge and give my warmest thanks to my Major
Professor Dr. Euihark Lee, who made this work possible. His guidance and advice
carried me throughout all the stages of my project. I would also like to thank my
committee members for letting my defense be an enjoyable moment, and for their
brilliant comments and suggestions.
        I would also like to give special thanks to my partner Timothy Roback
who knows my research almost as well as I do, and my family as a whole for their
continuous support and understanding throughout this journey. I wouldn’t have
been able to complete this thesis without all of your support, and for that I am so
grateful.
        Finally, I would like to acknowledge my roommate and lifelong friend
Cole Pauley, and my best friend Morgan Graham, who provided countless words
of support during the difficult parts of completing this degree. To my friends in
the industry, your passion is inspiring. For that, and your friendship, I am thankful
for each of you.


                                  TABLE OF CONTENTS
LIST OF TABLES ............................................................................................... vi
LIST OF FIGURES ............................................................................................ vii
LIST OF ABBREVIATIONS ............................................................................. ix
CHAPTER 1: INTRODUCTION ........................................................................ 1
  1.1 Objective ....................................................................................................... 4
CHAPTER 2: BACKGROUND .......................................................................... 5
  2.1 E-Commerce ................................................................................................. 5
  2.2 Package Evaluation: Lab Testing ................................................................. 8
  2.3 Package Evaluation: Field Testing ............................................................. 13
  2.4 Package Evaluation: Computer Simulation................................................ 17
  2.5 Artificial Neural Networks (ANN) .............................................................. 20
  2.6 Research Goal ............................................................................................. 25
CHAPTER 3: METHODS ................................................................................. 26
  3.1 Data Collection ........................................................................................... 27
  3.2 Data Preparation ........................................................................................ 33
  3.3 ANN Modeling ............................................................................................ 36
  3.4 Verification ................................................................................................. 39
CHAPTER 4: CASE STUDY ............................................................................ 43
  4.1 Data Collection ........................................................................................... 43
  4.2 Data Preparation ........................................................................................ 46
  4.3 ANN Modeling ............................................................................................ 49
  4.4 Results and Verification .............................................................................. 53
CHAPTER 5: CONCLUSION........................................................................... 63
BIBLIOGRAPHY ............................................................................................... 64


                                                LIST OF TABLES
Table 1. ISTA 3A Drop Sequence (ISTA, 2018).................................................. 30
Table 2. Methodology Drop Orientations. ............................................................ 31
Table 3. Image Modifications on Television Images from E-Commerce
Platform................................................................................................................. 35
Table 4. Exhaustive Search Method Example for Neurons per Hidden Layer. .... 38
Table 5. Exhaustive Search Method Example for Activation Function and
Solver. ................................................................................................................... 39
Table 6. Corrugated RSC Sample Dimensions. .................................................... 45
Table 7. Number of Images in Each Damage Category. ...................................... 46
Table 8. Image Modifications on Corrugated RSC Images. ................................. 47
Table 9. Number of Images per Damage Category for Testing and Training
Split. ...................................................................................................................... 50
Table 10. Neurons per Layer High Saturation Cropped. ...................................... 51
Table 11. Neurons per Hidden Layer in Each Image Modification Category. ..... 52
Table 12. Activation Function and Solver for High Saturation Cropped. ............ 53
Table 13. Prediction Table Output from ANN. .................................................... 56


                                   LIST OF FIGURES
Figure 1. Package Optimization in terms of cost. ................................................... 2
Figure 2. Retail e-commerce sales worldwide from 2014 to 2024 (in billion US
dollars) (International Trade Administration, 2023). .............................................. 6
Figure 3. Touchpoints of brick-and-mortar and e-commerce distribution (Skyline
University College, 2016). ...................................................................................... 7
Figure 4. Drop testing machine from Lansmont (Lansmont, 2023). ...................... 9
Figure 5. Vibration test system from Lansmont (Lansmont, 2023). ..................... 10
Figure 6. Compression tester from Lansmont (Lansmont, 2023). ........................ 12
Figure 7. Shock and vibration data logger from Lansmont (Lansmont, 2023)..... 14
Figure 8. Environmental monitoring tool from Sensolus. (Sensolus, 2023). ....... 15
Figure 9. Comparison of movement of unit load components under deflection for
experimental (left) and FEM (right) for two layers and (a) two columns, (b) three
columns, and (c) four columns of packages (Molina, 2021). ............................... 19
Figure 10. Conceptual diagram of an ANN. (UpGrad, 2022). ............................. 21
Figure 11. Conceptual diagram of calculations made within an ANN
(Obuchowski, 2020). ............................................................................................. 22
Figure 12. Commonly utilized activation functions: Identity (a), ReLu (b), Tanh
(c), and Logistic (d) (Šegota, 2020). ..................................................................... 23
Figure 13. Overview of methodology. .................................................................. 27
Figure 14. Conceptual diagram of the Web Scraping process. ............................. 28
Figure 15. Sample labeling example. .................................................................... 30
Figure 16. Box Compression Test Example (Rycobel, 2023). ............................. 33
Figure 17. ANN Hidden Layers. ........................................................................... 36
Figure 18. Conceptual Confusion Matrix (Draelos, 2019). .................................. 40
Figure 19. Prediction Workflow Overview. ......................................................... 41
Figure 20. Sample labeling example. .................................................................... 44
Figure 21. Examples of images from the "High Contrast" modification
category. ................................................................................................................ 49
Figure 22. Contour Plot for High Contrast Cropped Predictive Accuracies......... 51
Figure 23. Predictive Accuracies of Corrugated RSC Images in each modification
category. ................................................................................................................ 54
Figure 24. Images from the Highest Predicted Categories: “Black and White
Original” (A), “High Saturation Original” (B), “Low Contrast Cropped” (C), and
“High Contrast Cropped” (D). .............................................................................. 54
Figure 25. Confusion Matrices for Black and White "Original" (A) and "Cropped"
(B). ........................................................................................................................ 58
Figure 26. Incorrectly predicted edge images from the category "Black and White
Cropped". .............................................................................................................. 58
Figure 27. Confusion matrices for the highest predicted categories: Black and
White Original (A), High Saturation Original (B), High Saturation Cropped (C),
and Low Contrast Cropped (D)............................................................................. 60
Figure 28. Images that could fit in either "Edge" or "Corner" prediction
categories. Image A: Corner Drop, Image B: Edge Drop. .................................... 62


                       LIST OF ABBREVIATIONS
ANN Artificial Neural Network
RSC  Regular Slotted Container
FEM Finite Element Method
ReLu Rectified Linear Unit
SGD Stochastic Gradient Descent
LB   Lower Bound
UB   Upper Bound
NN   Neural Network
B&W Black and White


                     CHAPTER 1: INTRODUCTION
        Packaging is used in every industry to ensure protection, containment,
convenience, and communication to consumers and manufacturing professionals.
Protection and containment refer to the product itself by ensuring that the product
will arrive at its destination intact and in a damage-free and manageable way.
Convenience and communication refer to the use of each package. Consumers must
be able to utilize packaging in a convenient way while also having all pertinent
information provided to them. There are three layers of packaging commonly used.
These include primary packaging, which is the packaging in direct contact with the
product; secondary packaging, which is used outside of primary packaging to group
products together; and tertiary packaging, which is used by wholesalers for
shipping products to their destination while avoiding damage. These layers of
packaging play a fundamental role through the supply chain. All products undergo
distribution, which can involve rough shipping environments. Ensuring that a
product’s package meets each of the four functions of packaging will assist
in facilitating smooth distribution. To ensure this, packaging evaluation is used
before distribution.
        Packaging evaluation is very important for packaging cost and optimization.
When considering the type of packaging material to be used, there are many costs,
including the material costs and manufacturing costs, to consider. For example, a
package that is created with a heavier-weight corrugate will cost more to produce
than one with a lighter-weight corrugate. This also impacts the optimization of the
package. Figure 1 below shows a chart of damage cost vs. package cost. If the
package cost is low but the damage cost is high, the product is under-packaged. At the
opposite end, if the package cost is high but the damage cost is low, the
product is over-packaged. Optimization occurs between these extremes, when the damage
cost and package cost are equal, or close to equal. Ensuring that a package is within
this optimal zone will result in a successfully protected product through the supply
chain.
                 Figure 1. Package Optimization in terms of cost.
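
The balance between package cost and damage cost described above can be illustrated with a minimal sketch. The cost curve, the candidate package costs, and the function names below are hypothetical and chosen only for illustration; they are not taken from Figure 1 or from any measured data.

```python
import math

def damage_cost(package_cost: float) -> float:
    """Hypothetical damage cost per shipment; assumed to fall as package cost rises."""
    return 4.0 / package_cost

def classify(package_cost: float) -> str:
    """Label a design by comparing its package cost with its (assumed) damage cost."""
    damage = damage_cost(package_cost)
    if math.isclose(package_cost, damage, rel_tol=1e-9):
        return "balanced"
    return "under-packaged" if package_cost < damage else "over-packaged"

def find_balance(candidates):
    """Return the candidate whose package cost is closest to its damage cost."""
    return min(candidates, key=lambda cost: abs(damage_cost(cost) - cost))

# Illustrative candidate package costs (arbitrary currency units).
candidates = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
best = find_balance(candidates)
for cost in candidates:
    print(f"package cost {cost:.2f}, damage cost {damage_cost(cost):.2f}: {classify(cost)}")
print("closest to the optimal zone:", best)
```

In practice the damage cost curve would have to be estimated from observed damage rates and replacement costs rather than assumed.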


        Packaging evaluation is an important step in any product development
process to ensure that a product will survive the various stressors of distribution.
The types of distribution include brick-and-mortar and e-commerce. Brick-and-
mortar distribution refers to distribution to a physical store where customers browse
and make purchasing decisions in person, while e-commerce distribution refers to
the distribution of products that are purchased online and shipped directly to the
consumer. Stressors in distribution can include shock and vibration, among others.
Various physical tests, including vibration, shock, and compression testing, are
conducted on new package designs to ensure protection before distribution.
        Currently, the packaging evaluation process is heavily reliant on mechanical
testing in a controlled lab setting. Studies have been conducted to evaluate the
importance of physical testing on packages in this manner (Nygårds M, 2019;
Fadiji, 2016). Other methods of evaluation include field testing, where a package
undergoes the physical stressors of distribution, and computer simulation, where
a package is simulated and various properties are tested without physical testing.
Lab testing will ensure that a product is protected under controlled conditions, but
it does not account for the true stressors of distribution. Field testing, while
conducted in the true stressors of distribution, cannot account for the many
variables that are encountered in real-world environments. Computer simulations
can decrease the cost of physical testing, but their results are not always reliable.
Another method that has not yet been widely used for packaging evaluation is
Artificial Neural Networks (ANN). This tool can be used for many applications but
is used in this thesis to predict the cause of corrugated Regular Slotted Container
(RSC) damage from images.
1.1 Objective
        Package evaluation using ANN has been implemented in the packaging
industry, but not as widely for the visual analysis of a package. This thesis aims to
develop an ANN model that can predict the cause of package damage from images
of corrugated regular slotted containers (RSCs). To do this, various key objectives
were identified as follows:
    1. Develop a damage prediction model by implementing machine learning tools.
    2. Build an ANN modeling process for corrugated RSC damage prediction.
    3. Find the optimal image modification for ANN modeling.


                       CHAPTER 2: BACKGROUND
        Packaging has been utilized since ancient times to ensure that product
quality is maintained en route to the consumer. In the ancient era of packaging,
reed baskets, wineskins, wooden boxes, pottery vases, and other containers made of
natural materials were utilized. The first set-up boxes were used in the 16th century
(Twede, 2005). Since ancient times and the 16th century, countless advances have
been made to improve the functions of packaging in every industry. Materials have
been modified to better fit the needs of distribution, new technologies have been
introduced, and testing standards have been created and accepted by the industry.
These standards ensure that a package meets the needs of the product contained
while withstanding various stressors of distribution. Innovation is constant, though,
meaning that the widely accepted ways of developing packages must continue to
be developed in order to fit the needs of each company and consumer.
2.1 E-Commerce
        E-commerce is the term for the buying and selling of goods and services
over the internet. This tool has recently become a heavily utilized resource in the
supply chain for a variety of products. This style of distribution relates directly to
the convenience function of packaging, as consumers can obtain a higher level of
convenience by not having to go in person to search for the product they want.
There are several advantages to utilizing e-commerce, including its global reach.
This distribution style has been even further utilized since the start of the COVID-19
global pandemic, as consumers faced closings of brick-and-mortar stores and were
more likely to avoid public spaces (Jílková, 2021). The quarterly share of
total U.S. e-commerce retail sales has grown from 9.8% in the second quarter of
2018 to almost 15% of total sales in the third quarter of 2022 (Coppola, 2022).
Research by the International Trade Administration suggests that the global growth of
e-commerce will continue, reaching a point of approximately $7,000 billion in sales
by 2024 (International Trade Administration, 2023). A chart for this growth is
shown below in Figure 2.
  Figure 2. Retail e-commerce sales worldwide from 2014 to 2024 (in billion US
                dollars) (International Trade Administration, 2023).
        Packages shipped through this method of order fulfillment endure greater
stressors throughout the supply chain than those in brick-and-mortar distribution
because of the tougher environment and hazards that they encounter. In
brick-and-mortar distribution, there are a minimum of 4 handling points compared to
a minimum of 11 handling points in e-commerce distribution (Skyline University
College, 2016). Figure 3 below shows
an example of the touchpoints for each of these distribution types.
 Figure 3. Touchpoints of brick-and-mortar and e-commerce distribution (Skyline
                               University College, 2016).
A greater number of touchpoints means there are more opportunities for damage to
occur through e-commerce distribution. Product deformations as a result of shock
through e-commerce were observed at a rate of 19.3% in a study conducted by
Spruit et al. (Spruit, 2021). Additionally, product deformations as a result of impact,
shock, and static load through e-commerce were observed at a rate of 10.5%
(Spruit, 2021). These damages can result in a higher cost of replacing products,
especially when considering that a new product will have to endure the same supply
chain en route to the consumer. Additionally, when damage is observed, a
package must undergo retesting to ensure that future damage is avoided. This
increases costs due to the number of samples required while testing, as well as the
labor associated with these tests. As shown, it is very important that packaging is
developed with protection through e-commerce distribution environments in mind.
This can be achieved through package development and evaluation with a goal of
ensuring the four functions of packaging.
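
The effect of the touchpoint counts cited earlier in this section (a minimum of 4 for brick-and-mortar versus 11 for e-commerce) can be sketched with a simple probability calculation. The sketch assumes that every handling point carries the same, independent chance of causing damage; the value of `p_damage_per_touch` is hypothetical and is not taken from any of the cited studies.

```python
def prob_any_damage(p_damage_per_touch: float, touchpoints: int) -> float:
    """Probability of at least one damaging event, assuming independent,
    identical handling points: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_damage_per_touch) ** touchpoints

# Hypothetical per-touchpoint damage probability (illustration only).
p_damage_per_touch = 0.02

brick_and_mortar = prob_any_damage(p_damage_per_touch, 4)   # minimum of 4 handling points
e_commerce = prob_any_damage(p_damage_per_touch, 11)        # minimum of 11 handling points

print(f"brick-and-mortar: {brick_and_mortar:.3f}")
print(f"e-commerce:       {e_commerce:.3f}")
```

Under these assumptions the chance of at least one damaging event grows with every added touchpoint, which is the qualitative point made in the text; real handling points are neither identical nor independent.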
2.2 Package Evaluation: Lab Testing
        There are three main approaches to ensuring that a package fulfills its functions: lab
testing, field study, and computer simulation. Lab testing, conducted in a controlled
laboratory environment, follows standards that have been widely accepted in the
packaging industry. These standards specify the exact methods for evaluating
packages for various purposes. For example, ISTA 6-Amazon.com-SIOC is the
standard test method for products that are meant to ship in their primary packaging
through the Amazon e-commerce platform (ISTA, 2016). There are many standards
for different distribution environments and each of these standards specifies the
number of samples that should be used to ensure thorough results (ASTM
International, 2016; ASTM International, 2022). These standards have been set up
for all types of physical testing, including drop testing, vibration testing, and
compression testing. A common standard for conducting drop testing is the ISTA
3A standard for Packaged-Products for Parcel Delivery System Shipment (ISTA,
2018). In this procedure, a series of drops and vibrations are conducted on a single
container using a drop tester (Figure 4) and vibration table. For this example, only
the drop portions will be explained.
         Figure 4. Drop testing machine from Lansmont (Lansmont, 2023).
        Packages are loaded with the product including all primary packaging.
Then, the package undergoes a series of drops from heights ranging from 18 to 36 inches.
The package is positioned for these drops so that it encounters impact on various
faces, corners, and edges of the corrugated case to ensure that a thorough evaluation
of damage can be performed.
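
To give a sense of what the drop heights mentioned above mean physically, the sketch below converts a drop height to an impact velocity using the free-fall relation v = sqrt(2gh). It assumes an ideal free fall with no air resistance and no rotation of the package; the function name is illustrative and is not part of any test standard.

```python
import math

G = 9.80665           # standard gravity, m/s^2
INCH_TO_M = 0.0254    # exact conversion factor

def impact_velocity_m_s(drop_height_in: float) -> float:
    """Impact velocity for an ideal free fall from the given height in inches."""
    height_m = drop_height_in * INCH_TO_M
    return math.sqrt(2.0 * G * height_m)

v_low = impact_velocity_m_s(18.0)    # lower end of the height range in the text
v_high = impact_velocity_m_s(36.0)   # upper end of the height range in the text

print(f"18 in drop: {v_low:.2f} m/s")
print(f"36 in drop: {v_high:.2f} m/s")
```

Because velocity scales with the square root of height, doubling the drop height raises the impact velocity by a factor of about 1.41 rather than 2.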
        A commonly followed standard for vibration testing is ASTM D4728, the
Standard Test Method for Random Vibration Testing of Shipping Containers
(ASTM International, 2022). In this procedure, packages undergo a series of
random vibrations on a vibration (or shaker) tester to determine how they should
perform through distribution. Figure 5 shows an example of what this vibration
tester looks like.
         Figure 5. Vibration test system from Lansmont (Lansmont, 2023).
         Packages are loaded with the product and all associated primary packaging.
The package is placed in either a horizontal or vertical orientation related to the
direction of corrugate fluting, and supports are positioned to ensure that the package
will not vibrate off of the table while still allowing space for movement. Then,
random vibrations are applied for a predetermined amount of time which is specific
to each individual or group performing the test. In this standard, the vibration rates
and times can be adjusted to meet the needs of a specific distribution environment.
For example, if a package must travel for 2 hours to reach its destination and the
shipping route is known, the vibration test can be conducted for 2 hours under
vibration rates similar to those on its route. Many additional types of vibration
testing can be conducted. These include sine sweep testing, which involves
subjecting a package to a vibration whose frequency gradually changes over time; fixed
frequency testing, which involves subjecting a package to vibration at a constant
frequency; and resonance search testing, which involves identifying the resonant
frequency of a package and subjecting the package to vibrations around that
frequency. Additionally, random vibration and sine sweep vibration testing can be
combined for sine on random testing. In this method, a package is subjected to
random vibrations that coincide with a single frequency of vibration. This test
allows the professional to identify how a package will perform both at random
vibrations and a set vibration over time.
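
The vibration test types described above can be sketched as simple drive signals. This is an illustrative sketch only: it assumes a sine sweep whose frequency rises linearly, and it uses unshaped Gaussian noise as a stand-in for random vibration, whereas an actual test would follow the vibration profile specified in the chosen standard. All function and parameter names are hypothetical.

```python
import math
import random

def fixed_frequency(freq_hz, duration_s, sample_rate_hz, amplitude=1.0):
    """Constant-frequency sine signal."""
    n = int(duration_s * sample_rate_hz)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n)]

def sine_sweep(f_start_hz, f_end_hz, duration_s, sample_rate_hz, amplitude=1.0):
    """Sine signal whose frequency rises linearly from f_start_hz to f_end_hz."""
    n = int(duration_s * sample_rate_hz)
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    signal = []
    for i in range(n):
        t = i / sample_rate_hz
        # Phase is the integral of the instantaneous frequency over time.
        phase = 2 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        signal.append(amplitude * math.sin(phase))
    return signal

def sine_on_random(freq_hz, duration_s, sample_rate_hz,
                   sine_amplitude=1.0, noise_std=0.5, seed=0):
    """Single-frequency sine superimposed on Gaussian noise (a simplification)."""
    rng = random.Random(seed)
    tone = fixed_frequency(freq_hz, duration_s, sample_rate_hz, sine_amplitude)
    return [sample + rng.gauss(0.0, noise_std) for sample in tone]

# Example: a 2-second sweep from 3 Hz to 100 Hz sampled at 1 kHz (values are arbitrary).
sweep = sine_sweep(3.0, 100.0, 2.0, 1000)
combined = sine_on_random(10.0, 2.0, 1000, seed=1)
print(len(sweep), len(combined))
```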
        Compression testing is commonly conducted following ASTM D642-20,
the Standard Test Method for Determining Compressive Resistance of Shipping
Containers, Components, and Unit Loads (ASTM International, 2020). In this
testing method, packages are loaded with the product and any associated primary
packaging and placed on a compression tester (Figure 6).


         Figure 6. Compression tester from Lansmont (Lansmont, 2023).
       The package is placed either horizontally or vertically relative to the
alignment of corrugate fluting and undergoes one compression. The test stops in
one of two ways: once the compression reaches a predetermined point, or once a
point of failure has been achieved. Conducting this type of test can assist in
determining the strength of a container when stacked for long periods of time.
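
The two stopping conditions described above can be sketched as a small piece of control logic. The sketch assumes that load readings arrive at equal deflection steps and that failure is detected when the load falls a set fraction below its peak; this failure criterion and every name below are assumptions made for illustration, not definitions taken from the standard.

```python
def run_compression_test(load_readings_n, deflection_step_mm,
                         max_deflection_mm, failure_drop_fraction=0.2):
    """Step through load readings and stop on failure or at a preset deflection.

    Failure is assumed (hypothetically) to mean the load has dropped by
    failure_drop_fraction relative to the peak load seen so far.
    """
    peak_load = 0.0
    deflection = 0.0
    for step, load in enumerate(load_readings_n, start=1):
        deflection = step * deflection_step_mm
        peak_load = max(peak_load, load)
        if peak_load > 0 and load < (1.0 - failure_drop_fraction) * peak_load:
            return {"reason": "failure", "deflection_mm": deflection,
                    "peak_load_n": peak_load}
        if deflection >= max_deflection_mm:
            return {"reason": "preset deflection reached",
                    "deflection_mm": deflection, "peak_load_n": peak_load}
    return {"reason": "readings exhausted", "deflection_mm": deflection,
            "peak_load_n": peak_load}

# Illustrative readings in newtons, one per millimetre of deflection.
failed = run_compression_test([100, 250, 400, 520, 380, 300], 1.0, 10.0)
stopped = run_compression_test([100, 200, 300], 1.0, 2.0)
print(failed)
print(stopped)
```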
        Retesting that must occur if large amounts of damage are reported can
become very expensive and time-consuming as a result of these standards. While
the widely accepted standards for lab testing are thorough, they are not without
drawbacks. A study by Frank et al. discusses the limitations of compression testing
in-lab on corrugate boxes when compared to their real-world environments and
found that lab testing alone is not enough to measure the true functionality of a
package through the supply chain (Frank, 2010). It can be assumed that this holds
true when comparing brick-and-mortar to e-commerce distribution. To account for
this, many packaging professionals utilize field testing for further evaluation.
2.3 Package Evaluation: Field Testing
         Field research on package design can show the truest functionality of a
package as it moves through the supply chain. This method is commonly used after
lab testing has been completed. In this evaluation method, products are packaged
in the way that they are expected to enter the supply chain. They are then sent out
with either a traditional shipping source, like USPS or UPS, or with
company-owned distribution tools. One benefit of conducting field testing with
company-owned tools is that the test can provide a view of how a package will perform in a
more controlled environment than that of a typical distribution service. However,
this form of field testing does not always give the closest view of actual distribution. When utilizing this
method, all factors of distribution are set and controlled by the package producer.
These factors are not always consistent with how distribution will actually occur,
resulting in a higher chance for damage after validation testing. This is where it
may be of benefit to use a traditional shipping source like USPS or UPS. By sending
a package out into an uncontrolled distribution environment, packaging
professionals can evaluate how a package will truly perform. Information can be
gathered from field testing in numerous ways, including vibration and shock
records with data logging tools, and physical damage depending on the field test
administered. An example of a shock and vibration data logger from Lansmont is
shown in Figure 7 below.
   Figure 7. Shock and vibration data logger from Lansmont (Lansmont, 2023).
With this tool, shock and vibration data from a distribution route can be recorded
and saved. This can be used in conjunction with lab testing to better simulate the
environment that a package will undergo.
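
As an illustration of how recorded shock data might be summarized before it is used to inform lab testing, the sketch below groups consecutive above-threshold acceleration samples into shock events. The data layout (time in seconds, acceleration in g), the threshold value, and the function name are all hypothetical; they do not describe the export format of any particular data logger.

```python
def find_shock_events(samples, threshold_g):
    """Group consecutive samples at or above threshold_g into shock events.

    samples: iterable of (time_s, acceleration_g) pairs (hypothetical layout).
    Returns a list of dicts with the start time and peak magnitude of each event.
    """
    events = []
    current = None
    for time_s, accel_g in samples:
        magnitude = abs(accel_g)
        if magnitude >= threshold_g:
            if current is None:
                current = {"start_s": time_s, "peak_g": magnitude}
            else:
                current["peak_g"] = max(current["peak_g"], magnitude)
        elif current is not None:
            events.append(current)
            current = None
    if current is not None:  # recording ended during an event
        events.append(current)
    return events

# Made-up recording for illustration only.
recording = [(0.0, 0.2), (0.1, 12.0), (0.2, -18.5), (0.3, 0.4), (0.4, 0.1), (0.5, 25.0)]
events = find_shock_events(recording, threshold_g=10.0)
print(events)
```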
        Certain products may have attributes that cannot exceed a set level of
humidity or temperature. To ensure that a package is meeting the protection needs
of a product like this, environmental monitoring tools can be utilized in conjunction
with field testing. Figure 8 (below) shows an example of one of these tools that can
be implemented in field studies.


    Figure 8. Environmental monitoring tool from Sensolus. (Sensolus, 2023).
A tool like the one shown in Figure 8 can store data about humidity, temperature,
contact, and orientation. This information can be extracted with the specific route
that was taken, allowing for evaluation of package needs for a product meant to
follow that same route. Data collected from tools like those shown in Figures 7 and
8 can assist in optimizing packaging by ensuring that the cost of the package is
viable when considering the type of product to be packaged. This is especially
important when considering medical devices and electronics. The cost of each
product is high, meaning the packaging must be of high enough quality to ensure
that losses are not observed in the form of product damage as a result of insufficient
packaging.
         Many researchers have conducted field studies (Dunno, 2017; Jung, 2012),
since studies like these are the closest possible test to a real-world
application. A study by Zhong et al. focuses on corrugate boxes through
shipping conditions in China and found that a majority of packages aren’t placed in
the correct position during shipping, resulting in a greater number of package drops
(Zhong, 2016). The placement of a package through distribution cannot always be
controlled, so adjusting packages to meet protection needs in uncertain
environments can assist in ensuring package success. Conducting testing in the field
quickly becomes costly and time consuming due to the number of resources
required, especially when considering the cost of the product contained. Moreover,
these studies cannot reliably measure and evaluate the functionality and protection of the
package due to the lack of consistency in shipping conditions. These environments
can change as shipping routes change, as well as when the environment changes
the conditions of a route naturally.


2.4 Package Evaluation: Computer Simulation
        An alternative to lab and field testing is the use of computer models.
Computer models have become a popular method for simulating and evaluating
packaging test methods. There are many kinds of computer simulations, including
computational fluid dynamics, which can be used to model the flow of air, liquid,
or gas inside a package; moldflow analysis, which can be used to predict how a
plastic package will be molded during manufacturing; drop and impact simulation,
which can predict the damage on a package when dropped; and thermal analysis,
which can be used to evaluate the temperature distribution inside a package through
various environments. Another common form of computer simulation is the finite
element method (FEM). Many studies have been conducted simulating various
stages of the supply chain that a package must endure utilizing FEM, which is a
computer modeling technique that can be used to simulate the stresses and strains
placed on a package during distribution and storage. This tool has been used in
previous research to simulate the behavior of multiple materials to evaluate their
performance in various applications (Rowson, 2008; Mills, 2005; Hallbäck, 2014;
Huang T. C., 2022). This technique can help to identify areas of a package that are
most vulnerable to damage, allowing for a packaging professional to evaluate and
adjust the package before production. A study by Molina et al. details a
friction-driven FEM model that simulates the load bridging effect of unit loads stored
in warehouse racks (Molina, 2021). Figure 9 shows a comparison of movement of unit
load components under deflection for in-lab experiments (left) and with FEM
(right). In this figure, FEM images are labeled based on deflection, with red
meaning large displacement and blue meaning no movement. As the figure shows,
utilizing FEM to simulate the behaviors of packages in various
environments can provide insightful data for how pallet loads or packages must be
adapted to survive different conditions.


Figure 9. Comparison of movement of unit load components under deflection for
experimental (left) and FEM (right) for two layers and (a) two columns, (b) three
           columns, and (c) four columns of packages (Molina, 2021).


         A study by Biancolini et al. focused on the buckling strength of corrugated
boxes with FEM and found that the model provides negligible error when compared
to a lab-tested value under the same conditions (Biancolini, 2003). Based on these
results, adopting this approach could reduce the monetary costs associated with
physical testing. While the benefits of utilizing FEM have been
observed, this process is not perfect. Materials have many properties that
change as their chemical composition is altered. Accounting for each of these
properties requires a large number of simulations. This can increase
the computational cost, and in many cases, the results are not reliable because the
simulations are based on estimations of real-world conditions. Because of this, a
new method is needed to evaluate packages that endure the stressors of e-commerce
to ensure customer satisfaction.
2.5 Artificial Neural Networks (ANN)
         Machine learning (ML), a subset of Artificial Intelligence, is a field of
computer science that studies algorithms and techniques for automating solutions
to complex problems that are difficult to program using conventional programming
methods (Rebala, 2019). There are many types of machine learning algorithms,
each with varying levels of supervision required by the user. The first of these is
supervised learning, where the data has already been labeled or classified with the
correct output. Unsupervised learning involves an algorithm that is trained on
unlabeled data. Semi-supervised learning contains a combination of labeled and
unlabeled data. Finally, reinforcement learning involves an algorithm that learns
from positive and negative feedback for certain actions. Artificial Neural
Networks (ANNs) are one class of ML model. ANNs can be utilized to make assumptions and
predictions about various datasets and work in a way similar to that of a human
brain, by making connections between characteristics of data points and drawing
conclusions. ANNs consist of an input layer which receives data, one or more
hidden layers where neurons are contained and make connections within the data,
and an output layer where a result is provided. A conceptual diagram of this is
shown in Figure 10.
           Figure 10. Conceptual diagram of an ANN. (UpGrad, 2022).
        Each neuron in the hidden layer(s) applies mathematical functions to the
data and passes the results of these functions to other neurons following the flow
of the network between layers. Each neuron’s value, as determined by an
activation function, is multiplied by the weight of its connection, which is
determined by the solver. These products are summed to provide the output value.
Figure 11 shows a conceptual diagram of the calculations made between neurons,
as well as the equation used for providing an output.
  Where:
  x: numerical value of the neuron from activation function
  w: weight of each connection from solver
  h: output value
        Figure 11. Conceptual diagram of calculations made within an ANN
                                (Obuchowski, 2020).
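        The calculation summarized in Figure 11 can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not the implementation
used in this thesis: the function name neuron_output and the example values are
hypothetical, and bias terms are left out because the legend above does not
mention them.

```python
def neuron_output(x, w):
    """Return h, the sum of each incoming value x_i times its weight w_i."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sum(x_i * w_i for x_i, w_i in zip(x, w))


# Hypothetical values for three incoming connections.
x = [0.5, -1.0, 2.0]  # neuron values after the activation function
w = [0.4, 0.3, 0.1]   # connection weights chosen by the solver
h = neuron_output(x, w)
```

Here h is the weighted sum 0.5(0.4) + (-1.0)(0.3) + 2.0(0.1), which is about 0.1.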
        Activation functions and solvers are contained between each neuron in the
ANN. Activation functions introduce nonlinearity into the output of a neuron,
allowing for the ANN to model complex nonlinear relationships between input and
output data. Commonly used activation functions include identity, logistic
(sigmoid), tanh (hyperbolic tangent), and rectified linear unit (ReLu). Graphs for
each of these activation functions can be found below in Figure 12.


  Figure 12. Commonly utilized activation functions: Identity (a), ReLu (b), Tanh
                         (c), and Logistic (d) (Šegota, 2020).
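        The four activation functions in Figure 12 can be written directly with
the Python standard library. The sketch below is illustrative only; the function
names are chosen here for readability, and the logistic function is written in a
numerically stable form, which is an implementation choice rather than something
taken from the cited sources.

```python
import math


def identity(z):
    """Linear activation: the output equals the input."""
    return z


def logistic(z):
    """Sigmoid activation: maps any real input into the range 0 to 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative inputs, this form avoids overflow in exp().
    e = math.exp(z)
    return e / (1.0 + e)


def tanh(z):
    """Hyperbolic tangent activation: maps input into the range -1 to 1."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: returns z if it is positive, otherwise 0."""
    return max(0.0, z)
```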
          Drawing on the state of research to date, Dubey et al. offer the following
explanations for various activation functions. Identity is a linear activation function
where the output of the function is equal to the input. Logistic, or sigmoid, maps
input to a value between 0 and 1 and is commonly used in probability models. Tanh
is a hyperbolic function that maps the value of an input between -1 and 1. ReLu is
a piecewise linear function that returns the value of an input if it is positive, and a
0 if it is not (Dubey, 2022). Solvers are used to optimize the weights between each
neuron. Three commonly used solvers include stochastic gradient descent (SGD),
L-BFGS-B, and Adam. SGD is an optimization algorithm that updates the weights
and biases of the network based on the gradient of the loss function with respect to
each parameter
(Bottou, 2012). L-BFGS-B is a quasi-Newton optimization algorithm that
approximates the Hessian matrix of the loss function to update weights and biases
within the network (Liu, 1989). Adam is a variant of SGD that adapts the learning
rate of each weight based on the first and second moments of the gradient
(Brownlee, 2017). Each activation function and solver provides its own benefits
depending on the application. Benefits of ANN as a whole include optimization,
predictive accuracy, and time savings. Applications of this tool can be found in
various industries, including marketing (Bloom, 2005). ANNs can also be utilized
for image-based evaluations. Kalyan et al. detail a method for diagnosing disease
conditions with the use of ANN (Kalyan, 2014). While the functionality of ANN
has been studied extensively, this application has not yet been widely brought to
the packaging industry. There have been instances of this tool being utilized in
terms of packaging analysis (Esfahanian, 2022; Xie, 2023; de Abajo, 2004), but
little has been done in terms of visual analysis of a package with images. In this
thesis, a method for utilizing the image analysis capabilities of ANN for corrugated
RSC evaluation is introduced.
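
        To make the difference between the SGD and Adam solvers described in this
section concrete, the following sketch applies both update rules to a single
weight. It is a simplified illustration, not the solver code used by any
particular software: the toy loss (w - 3)^2, the starting weight, the step counts,
and the learning rate are all assumptions, and the exact gradient is used where
SGD would normally estimate it from a random subset of the training data.

```python
import math


def loss_gradient(w):
    """Gradient of the toy loss (w - 3)^2 with respect to the weight w."""
    return 2.0 * (w - 3.0)


def sgd(w, steps=100, learning_rate=0.1):
    """Gradient-descent update: move against the gradient at a fixed rate."""
    for _ in range(steps):
        w -= learning_rate * loss_gradient(w)
    return w


def adam(w, steps=500, learning_rate=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam update: scale each step by running first and second moments."""
    m = 0.0  # first moment (running mean of the gradient)
    v = 0.0  # second moment (running mean of the squared gradient)
    for t in range(1, steps + 1):
        g = loss_gradient(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for early steps
        v_hat = v / (1.0 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_sgd = sgd(0.0)
w_adam = adam(0.0)
```

Both runs start from w = 0 and approach the minimum of the toy loss at w = 3.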


2.6 Research Goal
         The goal of this research is to predict the cause of box damage using
corrugated RSC images with an artificial neural network (ANN). It is important to
know how package failure has occurred in an effort to improve protection
throughout the supply chain. Knowing the cause of damage allows professionals to
improve that area without conducting a full investigation, which involves total
retesting of the package in a controlled lab environment. If the packaging professional
utilizes field testing or simulations, large-scale damage reports would also require
an additional round of these tests and simulations, increasing
physical and computational costs further. By utilizing ANN for package evaluation,
it will be possible for packages to be evaluated in a real-world context, as opposed
to in-lab controlled testing. Additionally, the costs associated with testing will be
reduced due to the smaller number of samples required for post-production
evaluation. The ANN is able to predict what the cause of damage is for each
corrugated RSC, meaning a full retesting process is not necessary when damage is
reported. Industry professionals will be able to modify current packaging without
identifying the issue manually.


                         CHAPTER 3: METHODS
        This chapter introduces a novel methodology for corrugated box
evaluation. Instead of using a typical packaging test standard or numerical
simulation, image training with ANN was implemented as the main tool for
corrugated box evaluation. With this tool, the type of damage experienced by a
corrugated RSC can be predicted, allowing for a more streamlined evaluation
process. The packaging evaluation process using corrugated RSC images was
composed of four main processes: (i) data collection, (ii) data preparation, (iii) ANN
model development, and (iv) verification. Each process is important to ensure
optimized predictive accuracies with the ANN. Figure 13 shows an overview of the
methodology followed. The process begins with data collection, which involves
gathering images from previous validation testing, web scraping, or creating new
images in-lab. These images are then modified in various ways including cropping
to remove the background and modifying various color properties to fit into 12
different categories. Data is split into 70% for training and 30% for testing before
the ANN is modeled, which involves determining the number of hidden layers, the
number of neurons in each hidden layer, and the activation function and solver for
the model. Finally, predictions are verified using manual verification and confusion
matrix evaluation to ensure that all predictions can be considered valid. The
following sections explain the detailed process of each step.


                        Figure 13. Overview of methodology.
3.1 Data Collection
         In the data collection portion of this method, there are three ways that
images can be gathered. The first of these is by utilizing images previously taken
in the validation stage of package development. Utilizing these images would cut
down on the cost of new samples required when creating images, but they are not
always created in a way that highlights one form of damage.
         During validation testing, industry standards like ISTA 3A allow for multiple
tests to be performed on a single sample (ISTA, 2018). While this test is appropriate
for validation of the package, it is not designed to highlight a single type of damage
to the container. The second approach is by utilizing a web scraping tool that
gathers information in various forms from a website. The forms of data that can be
extracted in web scraping include product prices, consumer reviews, images, and
more. Figure 14 (below) shows a diagram of the web scraping process.


            Figure 14. Conceptual diagram of the Web Scraping process.
        To follow the process of web scraping, unstructured data in the form of a
link to a website is input to the web scraping tool. This tool must be programmed
to extract data that fits the needs of the user. Structured data can be output in many
forms, including images. In this method, images that are gathered must be labeled
manually based on the apparent type of damage. Consumers are not
always able to identify the specific kind of damage that their product has
experienced and typically will not mention the damage type in reviews. Packaging
professionals may be able to identify the possible cause of damage, but it may not
always be the true cause. When utilizing web scraping, it is important to recognize
that the legality and ethics of the tool have not yet been fully defined (Krotov,
2018). There is opportunity for this tool to extract data that can pose security risks
to consumers. Additionally, web scraping can result in the gathering of data that is
licensed to a corporation. If utilizing this method, care should be taken to ensure
that all data are gathered in an ethical and legal way.
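        The extraction step of web scraping described above can be sketched with
the Python standard library. The example is a minimal illustration and not the
tool used in this research: the class name, the sample HTML, and the example.com
addresses are hypothetical, and the sketch parses a string of HTML rather than
contacting a real website. Any real use would need to respect the legal and
ethical limits discussed above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page address.
                self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Turn unstructured page markup into a structured list of image links."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.image_urls


# Hypothetical review markup standing in for a downloaded page.
sample_html = """
<div class="review"><p>Box arrived crushed.</p>
<img src="/photos/review-1.jpg">
<img src="https://cdn.example.com/review-2.jpg"></div>
"""
urls = extract_image_urls(sample_html, "https://shop.example.com/item/123")
```

The resulting links would still need to be downloaded and labeled manually, as
noted above.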
         The third approach is creating new images via testing. Utilizing this method
allows for specific kinds of damage to be tested and documented which ensures that
the image label will be true to the kind of damage shown. For this method
specifically, impact (drop) testing and compression testing are utilized. Vibration
testing was not considered for this method due to the minimal visual damages
observed through testing. To conduct impact testing, the ISTA 3A standard for
packaged products for parcel delivery system shipment was modified and followed.
In this standard, a series of drops and vibrations are conducted at various heights
and vibration levels, focusing on different orientations of the package being tested
(ISTA, 2018). For the purposes of this research, vibration portions of ISTA 3A
were not followed. Table 1 below shows an overview of the drop sequence
associated with the ISTA 3A test. Figure 15 below shows a sample labeling
example followed in this testing. When following the drop sequence, the corrugated
RSC is oriented so that the drop occurs on the edge or corner where the numbered
faces meet. For example, if the drop orientation states “Edge 4-6,” that would
correspond to the edge between faces 4 and 6 on the RSC.
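
        The face-numbering convention above lends itself to a short enumeration of
every possible edge and corner of an RSC. The sketch below assumes that faces 1
and 3, 2 and 4, and 5 and 6 are the opposite pairs. That pairing is inferred from
the orientations listed in Tables 1 and 2 rather than quoted from the ISTA
standard, and the function names are hypothetical.

```python
from itertools import combinations, product

# Assumed opposite-face pairs, inferred from the listed drop orientations.
OPPOSITE_FACES = [(1, 3), (2, 4), (5, 6)]


def edges():
    """An edge is where two faces meet, so opposite faces never form one."""
    opposite = {frozenset(pair) for pair in OPPOSITE_FACES}
    return [
        pair for pair in combinations(range(1, 7), 2)
        if frozenset(pair) not in opposite
    ]


def corners():
    """A corner is where three faces meet: one face from each opposite pair."""
    return [tuple(sorted(faces)) for faces in product(*OPPOSITE_FACES)]


def label(kind, faces):
    """Format an orientation the way the drop tables do, e.g. 'Edge 4-6'."""
    return kind + " " + "-".join(str(face) for face in faces)


edge_labels = [label("Edge", e) for e in edges()]
corner_labels = [label("Corner", c) for c in corners()]
```

Under this assumption the enumeration yields 12 edges and 8 corners.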


                  Table 1. ISTA 3A Drop Sequence (ISTA, 2018).
           Drop #   Drop Height, Samples <70 lbs. (32 kg)   Orientation of Drop
              1               18 in (460 mm)                 Edge 3-4
              2               18 in (460 mm)                 Edge 3-6
              3               18 in (460 mm)                 Edge 4-6
              4               18 in (460 mm)               Corner 3-4-6
              5               18 in (460 mm)               Corner 2-3-5
              6               18 in (460 mm)                 Edge 2-3
              7               18 in (460 mm)                 Edge 1-2
              8               36 in (910 mm)                  Face 3
              9               18 in (460 mm)                  Face 3
                       Figure 15. Sample labeling example.
       Face drops were not included due to the minimal visual damage observed
through testing. Each drop orientation was conducted on a new sample from a
height of 36" to ensure that the true extent of one type of damage was observed.
These drops were also expanded to include each corner and edge orientation
possible for the corrugated RSC. In total, 14 drop orientations were included in the
drop testing process. Table 2 below shows the drop orientations for this
methodology in detail.


            Table 2. Methodology Drop Orientations.
  Orientation # (new     Drop height, 50 lbs.
  sample each time)     (22.7 kg) loaded sample     Orientation of drop
         1              36 in (910 mm)           Corner 3-4-5
         2              36 in (910 mm)           Corner 1-2-6
         3              36 in (910 mm)           Corner 1-2-5
         4              36 in (910 mm)           Corner 1-4-5
         5              36 in (910 mm)           Corner 1-4-6
         6              36 in (910 mm)           Corner 2-3-5
         7              36 in (910 mm)           Corner 2-3-6
         8              36 in (910 mm)           Corner 3-4-6
         9              36 in (910 mm)              Edge 1-2
        10              36 in (910 mm)              Edge 4-5
        11              36 in (910 mm)              Edge 4-6
        12              36 in (910 mm)              Edge 3-6
        13              36 in (910 mm)              Edge 1-5
        14              36 in (910 mm)              Edge 3-4


         Prior to each drop, a load of 50 lbs. (22.7 kg) was inserted into the case.
While this load may be greater than the expected weight of some products, it will
ensure that damage is shown for each drop administered. Images were taken from
multiple angles after the drop to capture the full extent of this damage.
         To conduct compression testing, ASTM D642-20 was modified and
followed (ASTM International, 2020). No weight was added to the case for this
test. Samples were loaded into a compression tester in either a horizontal or vertical
orientation relative to the direction of the corrugated fluting, with a preload of 50
lbs. Compression was applied to the case at a rate of 0.5 in/min until a yield of
50% from the maximum point of compressive load was achieved. This level of
yield may exceed the expected compression strength of many packages in use, but
it will ensure that the fullest extent of possible damage is shown. Photos were again
taken after each test from multiple angles to capture the full extent of damage.
Figure 16 shows an example of what this testing looks like in the lab with a vertical
case orientation. From here, collected data was prepared for modeling the ANN.
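
        The compression stopping rule described above can be illustrated with a
short sketch. It assumes that "a yield of 50% from the maximum point of
compressive load" means the measured load has fallen to half of the peak load
recorded so far. That reading, the function name, and the example load values are
assumptions made for illustration and are not taken from ASTM D642-20 or from the
tester's software.

```python
def find_stop_index(loads, yield_fraction=0.5):
    """Return the index of the first reading that has dropped by
    yield_fraction from the running peak load, or None if none has."""
    peak = 0.0
    for index, load in enumerate(loads):
        peak = max(peak, load)
        if peak > 0 and load <= (1.0 - yield_fraction) * peak:
            return index
    return None


# Hypothetical load readings in lbs., starting at the 50 lbs. preload.
loads = [50, 200, 450, 610, 580, 420, 300, 250]
stop_index = find_stop_index(loads)
```

In this example the peak is 610 lbs., so the test would stop at the first reading
at or below 305 lbs., which is the reading of 300 lbs.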


           Figure 16. Box Compression Test Example (Rycobel, 2023).
3.2 Data Preparation
        In the data preparation process, images were sorted into categories
according to their damage type and then modified. This sorting step is important in order to
provide the best chance at an accurate damage prediction by the ANN.
Modifications are important to ensure that the ANN can capture all features
associated with the damage observed. First, images were sorted into a category that
corresponds to the type of damage experienced. All drop images were split into
“Edge” or “Corner,” while all compression images were placed in a category
labeled “Compression.” Labeling and sorting these images in this manner allows
for simple data validation as well as organization before image modification. When
modifying, copies of each sorted damage type were made and placed into another
category that specifies the type of image modification done. All modifications were
done using the computer’s included image modification program. The first
modification included was to crop each image manually. This was done to
determine how the background of each image impacts prediction results. By
conducting this process manually, it can be ensured that the minimum amount of
background is shown in each image. Once cropping was completed, five additional
categories of image modifications were performed. These modifications included
black and white, high contrast, low contrast, low exposure, and high saturation.
When utilizing the image modification platform included on the computer,
modification sliders for these modifications were moved to the most extreme
version of each category. All categories of image modifications are important to
include because it is not known how the ANN sees and evaluates images. The ANN
draws information from each pixel associated with the image, making it difficult to
determine which modification would be best for the prediction. Including and
testing each modification makes it more likely that at least one modification category will
provide an accurate prediction. The modification category that predicts most
accurately will vary between sample properties, meaning this process will need to
be followed for each application. Table 3 shows an example of each
modification performed on images using a television image gathered from an e-
commerce platform. From here, the ANN can be modeled for use.
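
        The image modifications described in this section were made by hand in an
image editing program. As a rough illustration of what each modification does to
the pixel values that the ANN later reads, the sketch below applies comparable
operations to RGB pixels using only the Python standard library. The formulas and
the scaling factors are assumptions chosen for illustration. They do not
reproduce the slider settings of the program that was used.

```python
import colorsys
from itertools import product


def clamp(value):
    """Round to the nearest integer and keep it within the 0-255 range."""
    return max(0, min(255, int(round(value))))


def black_and_white(px):
    r, g, b = px
    gray = clamp(0.299 * r + 0.587 * g + 0.114 * b)  # common luminance weights
    return (gray, gray, gray)


def contrast(px, factor):
    # Stretch (factor > 1) or squeeze (factor < 1) values around mid-gray.
    return tuple(clamp(128 + factor * (c - 128)) for c in px)


def exposure(px, factor):
    return tuple(clamp(factor * c) for c in px)


def saturation(px, factor):
    h, l, s = colorsys.rgb_to_hls(*(c / 255 for c in px))
    r, g, b = colorsys.hls_to_rgb(h, l, min(1.0, s * factor))
    return (clamp(r * 255), clamp(g * 255), clamp(b * 255))


# Hypothetical factors standing in for the extreme slider positions.
MODIFICATIONS = {
    "Original": lambda px: px,
    "Black and White": black_and_white,
    "High Contrast": lambda px: contrast(px, 2.0),
    "Low Contrast": lambda px: contrast(px, 0.5),
    "Low Exposure": lambda px: exposure(px, 0.5),
    "High Saturation": lambda px: saturation(px, 2.0),
}
CROPS = ["Original", "Cropped"]

# Two crop states times six color treatments gives the 12 image categories.
categories = [crop + " / " + mod for crop, mod in product(CROPS, MODIFICATIONS)]


def modify_image(image, name):
    """Apply one named modification to an image stored as rows of RGB tuples."""
    return [[MODIFICATIONS[name](px) for px in row] for row in image]
```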


Table 3. Image Modifications on Television Images from E-Commerce Platform.
  Image version (columns): Original, Cropped
  Modification (rows): Original, Black and White, High Contrast, Low Contrast,
  High Saturation, Low Exposure


3.3 ANN Modeling
        Modeling the ANN involves splitting data, determining the neuron
structure, and determining the activation functions and solvers. Data were split into
70% for training and 30% for testing. Furthermore, each category of damage was
split with the same 70/30 setup to ensure that examples of each category are
included in both the testing and training portions. This was done manually to ensure
that each image modification category contained the same data split, providing a
true view of how the ANN predicts damage. The next step of ANN modeling was
determining neuron structure, which includes the number of hidden layers within
the network and the number of neurons in each hidden layer. Figure 17 below shows
an example of how hidden layers are set up within the ANN.
                           Figure 17. ANN Hidden Layers.
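        The per-category 70/30 split described above was performed manually in
this research. For illustration only, the same idea can be sketched as a seeded,
automated split in Python. The function name, the fixed seed, and the example
labels are assumptions, and the rounding rule for categories whose size is not a
multiple of ten is a choice made here rather than one stated in this thesis.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split item indices category by category so that every damage
    category appears in both the training and the testing portion."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split can be repeated
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label][:]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical label list with ten images per damage category.
labels = ["Edge"] * 10 + ["Corner"] * 10 + ["Compression"] * 10
train_idx, test_idx = stratified_split(labels)
```

Reusing the same index lists for every image modification category would keep the
split identical across categories, which is the purpose of the manual split
described above.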


        Following Figure 17, images are input to the first neuron on the left. These
images are embedded, and numerical values are extracted for each pixel associated
with the image that are pushed to the first hidden layer. Within this first layer,
activation functions are used to determine if the neuron should be activated for
evaluation or not. Then, a solver determines the weight associated with each
neuron. Values from the activation function and solver are multiplied to provide the
value for the next hidden layer connection point. This step repeats in the second
hidden layer, until all values are summed, and a prediction is output in the final
neuron on the right. This prediction is provided in the form of a damage category.
         The number of hidden layers that should be used has been researched
extensively and it has been found that two layers are sufficient for many
applications of an ANN since the potential number of neurons in each layer can be
large (Hecht-Nielsen, 1987; Kůrková, 1992; Huang G. B., 2003). Limiting the
number of hidden layers to two will allow for the prediction to be made with high
accuracy while limiting the computational cost of determining neuron numbers per
layer. Determining the number of neurons in each hidden layer was done with an
exhaustive search method which tests the predictive accuracy of the ANN at each
increment of neurons per hidden layer. An example of this process is shown below
in Table 4.


   Table 4. Exhaustive Search Method Example for Neurons per Hidden Layer.
  NN in 1st/2nd
  Layer         LB1     LB1+Δ   LB1+2Δ      .        .     LB1+nΔ     UB1
  LB2          0.881    0.795    0.863    0.885    0.795    0.843    0.874
  LB2+Δ        0.787    0.642    0.899    0.867    0.885    0.778    0.850
  LB2+2Δ       0.849    0.733    0.881    0.831    0.776    0.687    0.841
  .            0.871    0.873    0.798    0.698    0.851    0.881    0.735
  .            0.777    0.647    0.805    0.805    0.795    0.770    0.815
  LB2+nΔ       0.825    0.842    0.756    0.823    0.801    0.795    0.787
  UB2          0.632    0.881    0.793    0.776    0.884    0.856    0.739
  KEY:
  Δ: Increment               LB: Lower Bound
  NN: Neuron Number          UB: Upper Bound
        When utilizing this method, the upper and lower bounds of possible neuron
numbers must first be determined. The minimum and maximum bounds were
determined using the rule of thumb method which states that the number of neurons
per layer should be within the range of the number of datapoints available
(Karsoliya, 2012). Increments for the exhaustive search were decided according to
the minimum/maximum range. The combination with the highest predictive accuracy
was then selected. This method was used so that the selected neuron combination
is not merely a local optimum. While this method can have high computational costs, it will
ensure that maximum predictive accuracy is obtained.
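        The exhaustive search over neuron numbers can be sketched as a pair of
nested loops. The sketch below is a simplified illustration: score_fn is a
hypothetical placeholder for training the ANN with a given neuron structure and
returning its predictive accuracy, and the bounds, increment, and toy scoring
function are invented for the example.

```python
def grid(lower, upper, step):
    """Values from the lower bound to the upper bound at a fixed increment,
    always including the upper bound itself."""
    if lower > upper or step <= 0:
        raise ValueError("need lower <= upper and a positive step")
    values = list(range(lower, upper + 1, step))
    if values[-1] != upper:
        values.append(upper)
    return values


def exhaustive_search(lb1, ub1, lb2, ub2, step, score_fn):
    """Score every (first layer, second layer) neuron combination and
    return the best combination, its score, and the full table of scores."""
    scores = {}
    for n1 in grid(lb1, ub1, step):
        for n2 in grid(lb2, ub2, step):
            scores[(n1, n2)] = score_fn(n1, n2)
    best = max(scores, key=scores.get)
    return best, scores[best], scores


def toy_score(n1, n2):
    # Hypothetical stand-in for "train the ANN and return test accuracy".
    return 0.9 - 0.0001 * ((n1 - 60) ** 2 + (n2 - 40) ** 2)


best, best_score, table = exhaustive_search(20, 100, 20, 100, 20, toy_score)
```

Because every combination on the grid is scored, the cost grows with the product
of the two grid sizes, which is the computational cost noted above.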
        The final step of ANN modeling was to determine the activation function
and solver to be used. The available activation functions were identity, logistic,
tanh, and ReLu. Activation functions decide whether or not a neuron should be
activated for prediction while applying a mathematical function to the data. The
available solvers include L-BFGS-B, SGD, and Adam. Solvers are used to optimize
the parameters used in predictions by applying weights to each data value. The
combination to be used was determined using the same exhaustive search method.
Combinations of each activation function and solver were tested on the ANN and
the predictive accuracy was recorded. Then, the combination with the highest
predictive accuracy was selected. The combination selected for this method is the
tanh activation function with the Adam solver. An example of this exhaustive search
method is shown below in Table 5.
 Table 5. Exhaustive Search Method Example for Activation Function and Solver.
3.4 Verification
        The final step in this methodology was to verify that the results obtained are
accurate to the true labels of each image. This was done utilizing two methods:
manual verification and confusion matrix evaluation. When verifying manually,
image prediction labels were compared to the true label of their respective image.
By comparing each photo manually, the user can identify which image is not being
predicted correctly and identify any possible cause of the incorrect prediction.
Additionally, the ANN’s confidence rate in each prediction can be calculated. A
strong confidence rate for an incorrect prediction could indicate
an issue with the data, which the user can then investigate. The second method of
verification is utilizing confusion matrices. These show the breakdown of predicted
categories vs. the actual categories of data, which can also assist in determining
which images are not being predicted correctly. An example of a confusion matrix
is shown below in Figure 18.
              Figure 18. Conceptual Confusion Matrix (Draelos, 2019).
        If a predicted positive datapoint is actually negative, the confusion matrix
would show this datapoint as a “false positive.” Evaluating these matrices can also
assist in determining which image modification category will work best for the
damage the user is looking to predict. For example, if one image modification
category is predicting at a higher rate than others for the category “corner,” it may
be best to focus on that image modification for future corner damage predictions.
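        Both verification methods can be illustrated with a short sketch. The
code below builds a confusion matrix for the three damage categories, reports the
fraction of each true category that was predicted correctly, and lists incorrect
predictions made with high confidence. The function names, the 0.9 confidence
threshold, and the example labels and confidence values are hypothetical and are
not results from this research.

```python
from collections import Counter

CATEGORIES = ["Edge", "Corner", "Compression"]


def confusion_matrix(true_labels, predicted_labels, categories=CATEGORIES):
    """Rows are true categories and columns are predicted categories."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in categories] for t in categories]


def per_category_accuracy(matrix, categories=CATEGORIES):
    """Fraction of each true category that was predicted correctly."""
    result = {}
    for i, category in enumerate(categories):
        total = sum(matrix[i])
        result[category] = matrix[i][i] / total if total else None
    return result


def confident_errors(true_labels, predicted_labels, confidences, threshold=0.9):
    """Indices of wrong predictions made with confidence at or above threshold."""
    rows = zip(true_labels, predicted_labels, confidences)
    return [i for i, (t, p, c) in enumerate(rows) if t != p and c >= threshold]


# Hypothetical predictions for six test images.
true_labels = ["Edge", "Edge", "Corner", "Corner", "Compression", "Compression"]
predicted = ["Edge", "Corner", "Corner", "Corner", "Compression", "Edge"]
confidence = [0.95, 0.92, 0.88, 0.71, 0.99, 0.55]

matrix = confusion_matrix(true_labels, predicted)
accuracy = per_category_accuracy(matrix)
flagged = confident_errors(true_labels, predicted, confidence)
```

In this example the second image is flagged, since it was predicted as "Corner"
with a confidence of 0.92 although its true label is "Edge".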
        Figure 19 below shows an overview of the prediction workflow using the
Orange Data Mining software with the Image Analytics add-on (Orange Data Mining
Platform, 2023). As a reminder, training data consists of 70% of the data while
testing data consists of the remaining 30%. Following this figure, training images
are input to the widget labeled “Training Images.” These images are embedded to
extract numerical data which is pushed into the ANN widget to train the model. The
trained ANN model is connected to the prediction widget. Then, testing images are
uploaded to the “Testing Images” widget. These images are also embedded, and
numerical data for each is pushed to the prediction widget as data. Predictions are
made on the testing images by the trained ANN, providing a table of
prediction results with the confidence rates, predicted damage category, and error
rates. A confusion matrix widget is attached to the prediction widget, which can
show an overview of how images were predicted, including both correct and incorrect predictions.
The image viewer widget connected to the confusion matrix widget can show
selected categories from the confusion matrix, allowing for simple verification of
the images predicted.
                    Figure 19. Prediction Workflow Overview.


        The testing dataset, training dataset, and ANN widgets are also connected
to a “Test and Score” widget. This test and score widget provides the predictive
accuracy without a breakdown of each image, as well as the option to split data
randomly to evaluate how the ANN would perform without manual splitting of
data. This widget is helpful when determining how data should be split to provide
the most accurate predictions for each case, as well as when determining the neuron
structure as the breakdown of each predicted image is not necessary during that
step. In summary, the methodology consists of four steps: data
collection, data preparation, ANN modeling, and result verification.


                         CHAPTER 4: CASE STUDY
        This case study was conducted utilizing corrugated regular slotted
containers (RSC). Images were gathered by conducting new testing in-lab with a
focus on one type of damage for every sample, ensuring that each image label was
accurate to the type of damage experienced. From there, images were cropped and
various color properties were modified, resulting in 12 image categories for
predictions. Results were obtained, and verification was conducted with two
methods, manual verification and confusion matrix evaluation, to ensure that all
predictions were valid. The following sections explain this process in
detail.
4.1 Data Collection
        In this data collection process, images were created from in-lab testing on
corrugated RSCs, which were selected given that they are one of the most common
shipping containers. Creating data in a controlled environment allows for accurate
labels of the damage that occurred, as well as a true example of each category of
damage. During this process, a total of 69 corrugated RSCs underwent compression
and impact testing. During impact testing, 43 samples were used in various sizes
shown in Table 6 below. These samples were labeled in uniform fashion before
testing following the example in Figure 20 to ensure consistency across samples.
Corrugated RSCs were loaded with 50 lbs. of weight and dropped from a height of
36" one time per sample, following a modified ISTA 3A test detailed in the
methodology section. The drops were structured to ensure that one sample would
show damage from one component of the case: corner or edge. Additionally, each
of the 12 edges and 8 corners of a case were represented in the data. Images were
taken after the drop from multiple angles to ensure that all effects of damage are
shown. During the compression testing phase, 26 samples were used in various
sizes shown in Table 6.
                       Figure 20. Sample labeling example.


                  Table 6. Corrugated RSC Sample Dimensions.
  Corrugated      Impact     Number    Corrugated      Compression      Number
      RSC         Testing       of         RSC            Testing           of
  Dimensions     Samples     Images    Dimensions        Samples        Images
  12"x12"x12"        5          17       12"x8"x8"           1              3
   14"x10"x8"        1           3      12"x10"x8"           1              3
  14"x10"x12"        9          25       14"x6"x8"           1              2
  14"x12"x12"        5          15       14"x8"x8"           2              6
   14"x14"x8"        1           3      14"x10"x8"           1              3
   14"x14"x12"       2           6      14"x14"x8"           6             18
    16"x8"x8"        1           3       16"x8"x8"           1              3
  16"x12"x10"        1           3      16"x14"x8"           8             24
   16"x14"x8"        1           3      16"x16"x8"           5             15
  16"x14"x12"        6          18           -               -               -
  16"x16"x12"       11          34           -               -               -
        The preload associated with compression testing was 50 lbs. Compression
stopped once the cases reached a yield of 50% from the maximum point of
compression. This stopping point was set in the machinery but can also be
determined by evaluating the graph provided during testing. After the compression
load was applied, images were taken from multiple angles to ensure that all effects
of compression were recorded. From here, the collected data was prepared before
the ANN was modeled.


4.2 Data Preparation
       In the data preparation process, images were first labeled to represent the
type of damage experienced. All drop images were labeled either “Edge” or
“Corner,” while all compression images were labeled “Compression.” Table 7
below shows a distribution breakdown of how many images were sorted into each
category.
              Table 7. Number of Images in Each Damage Category.
                      Damage Category        # of Images
                             Edge                  70
                            Corner                 62
                        Compression                77
                            Total:                209
       Labeling images in this manner allows for simple data verification as well
as organization before image modification. When modifying, copies of each labeled
image were created and placed into another category that specifies the type of
image modification performed. All modifications were performed using the computer’s
included image modification program. Images were first manually cropped to
determine how the background would impact prediction results. Once this was
completed, five additional categories of modifications were performed on both the
cropped and original versions of the images: black and white, high
contrast, low contrast, low exposure, and high saturation. Including every
modification makes it more likely that at least one of the workflows will capture
details needed for the ANN to make a prediction. Table 8 below shows an example
of each modification performed on the lab created images.
            Table 8. Image Modifications on Corrugated RSC Images.
                                   Original             Cropped
                Original
            Black and White
             High Contrast
             Low Contrast
            High Saturation
             Low Exposure
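
         The modifications in this study were applied with the computer's included
image program. Purely as an illustration of what each adjustment does to a pixel,
the sketch below shows one common way to define them. The function names, the
factor values, and the luma weights are assumptions made for this example and are
not the settings used in the study.

```python
import colorsys

def clamp(value):
    """Clamp a channel value to the 0-255 range and round to an integer."""
    return max(0, min(255, round(value)))

def black_and_white(pixel):
    """Replace an (r, g, b) pixel with its luma (ITU-R BT.601 weights)."""
    r, g, b = pixel
    luma = clamp(0.299 * r + 0.587 * g + 0.114 * b)
    return (luma, luma, luma)

def contrast(pixel, factor):
    """Scale each channel's distance from mid-gray (128); factor > 1 raises contrast."""
    return tuple(clamp(128 + factor * (c - 128)) for c in pixel)

def exposure(pixel, factor):
    """Scale each channel's brightness; factor < 1 darkens the pixel."""
    return tuple(clamp(c * factor) for c in pixel)

def saturation(pixel, factor):
    """Scale saturation in HLS space while keeping hue and lightness."""
    r, g, b = (c / 255 for c in pixel)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r2, g2, b2 = colorsys.hls_to_rgb(h, l, min(1.0, s * factor))
    return tuple(clamp(c * 255) for c in (r2, g2, b2))

# Hypothetical factors; the study pushed each modification to its extreme.
MODIFICATIONS = {
    "Black and White": black_and_white,
    "High Contrast": lambda p: contrast(p, 2.0),
    "Low Contrast": lambda p: contrast(p, 0.5),
    "Low Exposure": lambda p: exposure(p, 0.5),
    "High Saturation": lambda p: saturation(p, 2.0),
}

def modify_image(pixels, name):
    """Apply one named modification to a list of (r, g, b) pixels."""
    return [MODIFICATIONS[name](p) for p in pixels]

sample = [(200, 150, 100), (50, 60, 70)]  # two made-up pixels, not study data
for name in MODIFICATIONS:
    print(name, modify_image(sample, name))
```

With a factor of 2.0, the contrast function pushes bright channels to the 255
ceiling and dark channels toward 0, which is the same clipping that makes portions
of a high-contrast image appear “blacked out.”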


         Images were modified to the most extreme version of each category. In
some cases of image modification, it appeared as though all features that could
assist in predicting the cause of damage were muted. Figure 21 (below) shows an
example of three images from the “High Contrast” category of image modification.
In this modification, portions of the corrugated RSC can appear to be “blacked out.”
This can be beneficial for the ANN evaluation as the impacted areas become
highlighted. Image “A” of this figure shows one corner of the corrugated RSC
highlighted at the location of damage from the drop. Similarly in image “B,” a
majority of the frontmost face of the RSC is muted with the exception of the
damaged corner. In image “C,” the muted portions of the RSC bring more attention
to the bright face of the box as well as the split on the corner. These examples show
that while an image modification may not seem like the best representation, there
are still useful details that can assist in predicting the damage type.


   Figure 21. Examples of images (A, B, and C) from the "High Contrast" modification
                                       category.
4.3 ANN Modeling
        To begin modeling the ANN, the data was first split into training and
testing portions. In total, 167 training images and 42 testing images were used.
When splitting the data, each drop and compression category was split into
approximately 80% for training and 20% for testing (see Table 9) to ensure that
each damage category had images in both the ‘training’ and ‘testing’ portions. These
data splits were performed manually to guarantee that they remained the same across
every image modification category. This ensured that each accuracy offered a
consistent view of how the ANN made predictions on the images provided. Table 9
below shows a breakdown of how each damage category was split between training and
testing.


 Table 9. Number of Images per Damage Category for Testing and Training Split.
                                                # of Images
             Damage Category
                                       Training             Testing
                   Edge                    56                 14
                  Corner                   50                 12
               Compression                 61                 16
                   Total:                 167                 42
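
         The realized split can be checked directly from the counts in Table 9. The
short sketch below computes the share of images assigned to training in each
damage category; the variable and function names are illustrative only.

```python
# Image counts per damage category, taken from Table 9 as (training, testing).
split_counts = {
    "Edge": (56, 14),
    "Corner": (50, 12),
    "Compression": (61, 16),
}

def training_share(train, test):
    """Return the fraction of a category's images assigned to training."""
    return train / (train + test)

shares = {name: training_share(*counts) for name, counts in split_counts.items()}
total_train = sum(train for train, _ in split_counts.values())
total_test = sum(test for _, test in split_counts.values())
overall_share = training_share(total_train, total_test)

for name, share in shares.items():
    print(f"{name}: {share:.1%} training")
print(f"Overall: {total_train} training, {total_test} testing "
      f"({overall_share:.1%} training)")
```

Because the splits were made by hand, the per-category shares are close to one
another but not identical.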
         The next step of this modeling was determining the neuron structure,
including the number of hidden layers and the number of neurons per hidden layer.
This was done using an exhaustive search with a minimum bound of 30 and a maximum
bound of 100 neurons per layer. The increment for testing combinations was 10, and
all combinations were tested and evaluated for the local maxima. Table 10 below
shows an example of this exhaustive search for the category “High Saturation
Cropped.” As shown in this table, the maximum accuracy of 0.905 occurs at several
combinations, which indicate the highest predictive accuracies of the model. For
this research, the combination (40,30) was used as it was the first instance of this
maximum in the “High Saturation Cropped” category. Figure 22 shows a contour
plot of the exhaustive search for this category. The same exhaustive search method
was performed across all image modification categories, providing a separately
selected number of neurons in each hidden layer for every category. A summary of the
results of these exhaustive searches can be found in Table 11 below.


             Table 10. Neurons per Layer High Saturation Cropped.
          30       40       50       60       70       80       90       100
 30     0.863    0.844    0.863    0.863    0.883    0.905    0.863    0.863
 40     0.905    0.863    0.905    0.863    0.863    0.883    0.883    0.905
 50     0.816    0.863    0.883    0.883    0.844    0.881    0.883    0.905
 60     0.844    0.863    0.883    0.883    0.883    0.835    0.883    0.884
 70     0.883    0.826    0.905    0.853    0.863    0.879    0.844    0.863
 80     0.863    0.883    0.863    0.863    0.826    0.883    0.863    0.905
 90     0.883    0.816    0.883    0.863    0.883    0.863    0.863    0.844
100     0.905    0.905    0.835    0.905    0.905    0.883    0.863    0.883
   Figure 22. Contour Plot for High Saturation Cropped Predictive Accuracies.
     (Both axes: neurons per hidden layer, 30 to 100. Contour bands: 0.75-0.8,
                        0.8-0.85, 0.85-0.9, and 0.9-0.95.)
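
         The exhaustive search described above can be sketched in a few lines. In
this sketch, the `evaluate` function is a stand-in that looks up the accuracies
already reported in Table 10 instead of training a network, and each pair is read
as (row label, column label), which is an assumption about the table's orientation.
All names are illustrative.

```python
from itertools import product

# Candidate neuron counts: lower bound 30, upper bound 100, increment 10.
NEURON_OPTIONS = list(range(30, 101, 10))

# Accuracies transcribed from Table 10 ("High Saturation Cropped").
# Outer index follows the table's row labels, inner index its column labels.
TABLE_10 = [
    [0.863, 0.844, 0.863, 0.863, 0.883, 0.905, 0.863, 0.863],
    [0.905, 0.863, 0.905, 0.863, 0.863, 0.883, 0.883, 0.905],
    [0.816, 0.863, 0.883, 0.883, 0.844, 0.881, 0.883, 0.905],
    [0.844, 0.863, 0.883, 0.883, 0.883, 0.835, 0.883, 0.884],
    [0.883, 0.826, 0.905, 0.853, 0.863, 0.879, 0.844, 0.863],
    [0.863, 0.883, 0.863, 0.863, 0.826, 0.883, 0.863, 0.905],
    [0.883, 0.816, 0.883, 0.863, 0.883, 0.863, 0.863, 0.844],
    [0.905, 0.905, 0.835, 0.905, 0.905, 0.883, 0.863, 0.883],
]

def evaluate(row_neurons, col_neurons):
    """Stand-in for training and scoring one ANN; here it looks up Table 10."""
    i = NEURON_OPTIONS.index(row_neurons)
    j = NEURON_OPTIONS.index(col_neurons)
    return TABLE_10[i][j]

def exhaustive_search(options, score):
    """Score every pair of options; return the best score and all pairs reaching it."""
    results = {pair: score(*pair) for pair in product(options, repeat=2)}
    best = max(results.values())
    winners = [pair for pair, value in results.items() if value == best]
    return best, winners

best_accuracy, best_pairs = exhaustive_search(NEURON_OPTIONS, evaluate)
print(f"{len(NEURON_OPTIONS) ** 2} combinations tested")
print(f"Best accuracy {best_accuracy} at: {best_pairs}")
```

The search returns every combination that ties for the maximum, so a tie-breaking
rule is still needed to settle on a single structure.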


           Table 11. Neurons per Hidden Layer in Each Image Modification Category.
                                    Black &       High         High         Low          Low
                        Original     White     Saturation    Contrast     Exposure     Contrast
 Original   1st Layer      90         40           70           60           70           70
            2nd Layer      30         40           60           40           70           40
 Cropped    1st Layer     100         50           40           30           40           50
            2nd Layer      70         30           30           30           30           30
        Activation functions and solvers were determined using the same
exhaustive search method. All combinations of the available activation functions
and solvers were tested, and the predictive accuracies were recorded for each
modification category. Table 12 shows the results of this search method for the
category “High Saturation Cropped.” As shown, the activation function ‘tanh’
provided the highest predictive accuracy (0.905) in conjunction with two solvers,
L-BFGS-B and Adam. The solver ‘Adam’ was selected, as this combination provided
high predictive accuracy throughout multiple categories of modification. This
activation function and solver combination was used across all image modification
categories. Once the ANN was modeled, two steps were taken to verify that the
results obtained were accurate.


           Table 12. Activation Function and Solver for High Saturation Cropped.
                                      Activation Function
   Solver            Identity      Logistic        Tanh           ReLU
   L-BFGS-B           0.835         0.785          0.905          0.742
   SGD                0.835         0.338          0.801          0.878
   Adam               0.863         0.844          0.905          0.834
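
            The same kind of search over activation functions and solvers can be
replayed from the values in Table 12. The sketch below enumerates the twelve
combinations and reports the ones that tie for the highest accuracy; the
per-solver average is an extra summary added for this example and is not a
criterion stated in the study.

```python
from itertools import product

ACTIVATIONS = ["identity", "logistic", "tanh", "relu"]
SOLVERS = ["L-BFGS-B", "SGD", "Adam"]

# Accuracies transcribed from Table 12 ("High Saturation Cropped").
TABLE_12 = {
    "L-BFGS-B": {"identity": 0.835, "logistic": 0.785, "tanh": 0.905, "relu": 0.742},
    "SGD": {"identity": 0.835, "logistic": 0.338, "tanh": 0.801, "relu": 0.878},
    "Adam": {"identity": 0.863, "logistic": 0.844, "tanh": 0.905, "relu": 0.834},
}

# Every (solver, activation) pair with its recorded accuracy.
scores = {
    (solver, activation): TABLE_12[solver][activation]
    for solver, activation in product(SOLVERS, ACTIVATIONS)
}

top_score = max(scores.values())
tied = [combo for combo, value in scores.items() if value == top_score]

# Average accuracy of each solver across the four activation functions.
solver_means = {
    solver: sum(TABLE_12[solver].values()) / len(ACTIVATIONS) for solver in SOLVERS
}
best_on_average = max(solver_means, key=solver_means.get)

print(f"Top accuracy {top_score} for: {tied}")
print(f"Highest average across activations: {best_on_average}")
```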
4.4 Results and Verification
            The ANN prediction accuracy results from this case study can be found in
Figure 23 below. The highest predictive accuracies were found in the categories
“Black and White Original,” “High Saturation Original,” “Low Contrast Cropped,”
and “High Saturation Cropped” at a rate of 91%. These accuracy rates show that
cropping the images does not always ensure higher predictive accuracy. In fact, the
two “High Saturation” categories predicted at the same rate. Throughout the whole
data, there were four instances where the cropped images predicted at an equal or
higher rate than their original counterparts.


 Figure 23. Predictive Accuracies of Corrugated RSC Images in each modification
                                     category.
        Figure 24 below shows images from the four highest predicted categories
for comparison of cropped images vs. original.
    Figure 24. Images from the Highest Predicted Categories: “Black and White
 Original” (A), “High Saturation Original” (B), “Low Contrast Cropped” (C), and
                         “High Saturation Cropped” (D).
        Evaluating Figure 24 provides some possible explanations for why cropped
images do not always predict at a higher rate than original images. When cropping
images, it is common for the resolution to decrease. This can make it more difficult
for the ANN to identify details and distinguish between similar attributes in an
image. Additionally, removing a majority of the background of each image can
cause a loss of context, which is important for distinguishing between the area of
interest in an image and the background. In this case, the smaller background area
could prevent the ANN from recognizing that the corrugated RSC in question makes up
the majority of the image, resulting in a prediction based on portions of the RSC
instead of the RSC in its entirety. While this was not the case in the two cropped
categories that predicted at a high rate, it could explain the lower predictive
accuracies in other cropped categories. This further suggests that it is important
to include all image modification categories when modeling the ANN, since it is not
completely certain how the ANN views and analyzes images. To examine these results
further, two verification methods were utilized.
        The first method of verification was manual. In this method, a table from
the data mining software’s prediction, shown below in Table 13, was utilized.


                  Table 13. Prediction Table Output from ANN.
 Image           Confidence Rates                              Actual      Image
   #      Compression   Corner    Edge    Prediction   Error   Category    Name
   1         0.00        1.00     0.00      Corner     0.001    Corner     Sample 42.2
   2         0.00        1.00     0.00      Corner     0.002    Corner     Sample 44.3
   3         0.00        0.76     0.23      Corner     0.238    Corner     Sample 25.3
   4         0.04        0.00     0.96       Edge      0.998    Corner     Sample 22.1
   5         0.00        1.00     0.00      Corner     0.001    Corner     Sample 50.2
   6         0.00        0.99     0.00      Corner     0.005    Corner     Sample 35.2
   7         0.01        0.01     0.98       Edge      0.990    Corner     Sample 48.1
   .           .           .        .          .         .         .           .
   .           .           .        .          .         .         .           .
  42         0.07        0.93     0.01      Corner     0.073    Corner     Sample 2.3
Table 13 provides the confidence level of each prediction in the confidence
rates columns, where three values ranging from 0 to 1 are shown. The first
confidence level corresponds to the category “Compression,” the second to
“Corner,” and the third to “Edge.” For example, for row 4 of this table, the ANN
had a confidence level of 0.04 that the image was compression damage, 0.00 that it
was corner damage, and 0.96 that it was edge damage. As shown from the
“Prediction” column compared to the “Actual Category” column, this prediction
was incorrect. This is also shown by the “Error” column, which contains an error
rate of 0.998 for this specific prediction. Analyzing the confidence levels for each
prediction can indicate whether the prediction is reliable. If the ANN is not very
confident in a prediction, correct or incorrect, the image can be checked to ensure
that its modifications are accurate and that its sorting is correct. If the
confidence level is high and the prediction matches the actual category, the
prediction can be considered valid. Performing manual verification is useful to
ensure that each image was modified and sorted correctly before confusion matrix
verification.
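
        The manual check can be illustrated with the rows displayed in Table 13.
The sketch below recomputes each prediction as the category with the highest
confidence and flags rows for review. Two points are assumptions made for this
example: the 0.80 review threshold is hypothetical, and the reading of the “Error”
column as one minus the confidence in the actual category is inferred from the
displayed rows and is not a documented definition.

```python
CLASSES = ("Compression", "Corner", "Edge")

# Rows displayed in Table 13:
# (image #, confidences in CLASSES order, reported error, actual category).
ROWS = [
    (1, (0.00, 1.00, 0.00), 0.001, "Corner"),
    (2, (0.00, 1.00, 0.00), 0.002, "Corner"),
    (3, (0.00, 0.76, 0.23), 0.238, "Corner"),
    (4, (0.04, 0.00, 0.96), 0.998, "Corner"),
    (5, (0.00, 1.00, 0.00), 0.001, "Corner"),
    (6, (0.00, 0.99, 0.00), 0.005, "Corner"),
    (7, (0.01, 0.01, 0.98), 0.990, "Corner"),
    (42, (0.07, 0.93, 0.01), 0.073, "Corner"),
]

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off for "not very confident"

def review(rows, threshold=REVIEW_THRESHOLD):
    """Summarize each row: predicted class, implied error, and review flags."""
    report = []
    for image, confidences, reported_error, actual in rows:
        top = max(confidences)
        predicted = CLASSES[confidences.index(top)]
        # Assumed reading of the Error column: 1 - confidence in the actual class.
        implied_error = 1.0 - confidences[CLASSES.index(actual)]
        report.append({
            "image": image,
            "predicted": predicted,
            "correct": predicted == actual,
            "error_gap": abs(implied_error - reported_error),
            "low_confidence": top < threshold,
        })
    return report

report = review(ROWS)
incorrect = [row["image"] for row in report if not row["correct"]]
low_confidence = [row["image"] for row in report if row["low_confidence"]]
print("Incorrect predictions:", incorrect)
print("Low-confidence predictions:", low_confidence)
```

In these rows, the two incorrect predictions (images 4 and 7) both carry a
confidence above 0.95, so a confidence threshold alone would not flag them; the
comparison against the actual category is what identifies them.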
        The second verification method involved evaluating confusion
matrices. These were used to see how the ANN performed in a broader sense. To
evaluate the impact of the image background on prediction results, confusion
matrices for the categories “Black and White” original (A) and cropped (B) are
shown below in Figure 25. In both categories, all compression images were
predicted correctly. For the “Black and White Original” category (A), two corner
images were predicted to be edge damage, while two edge images were predicted
to be corner damage. In the “Black and White Cropped” category (B), one corner
image and one edge image were predicted to be compression damage, and five edge
images were predicted to be corner damage. With seven misclassified images compared
with four, the cropped category performed worse than the original category. A
portion of the incorrectly predicted images from the “Black and White Cropped”
category can be seen in Figure 26, below.
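
        The two “Black and White” confusion matrices can be rebuilt from the
misclassifications described above together with the testing counts in Table 9.
The sketch below assumes that the errors listed in the text are the only ones, so
that every remaining test image was predicted correctly; the function names are
illustrative.

```python
CLASSES = ("Compression", "Corner", "Edge")
TEST_COUNTS = {"Compression": 16, "Corner": 12, "Edge": 14}  # testing column, Table 9

def build_matrix(errors):
    """Build matrix[actual][predicted] from {(actual, predicted): count} errors.

    Correct predictions are filled in as the class total minus its listed errors,
    which assumes the errors described in the text are the only ones.
    """
    matrix = {actual: {predicted: 0 for predicted in CLASSES} for actual in CLASSES}
    for (actual, predicted), count in errors.items():
        matrix[actual][predicted] = count
    for actual in CLASSES:
        matrix[actual][actual] = TEST_COUNTS[actual] - sum(matrix[actual].values())
    return matrix

def accuracy(matrix):
    """Fraction of test images on the diagonal of the confusion matrix."""
    correct = sum(matrix[cls][cls] for cls in CLASSES)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

bw_original = build_matrix({("Corner", "Edge"): 2, ("Edge", "Corner"): 2})
bw_cropped = build_matrix({
    ("Corner", "Compression"): 1,
    ("Edge", "Compression"): 1,
    ("Edge", "Corner"): 5,
})

for label, matrix in (("Original", bw_original), ("Cropped", bw_cropped)):
    print(f"Black and White {label}: accuracy {accuracy(matrix):.3f}")
    for actual in CLASSES:
        print(f"  {actual:12s}", [matrix[actual][p] for p in CLASSES])
```

Under this assumption, 38 of 42 images are correct for the original category and
35 of 42 for the cropped category.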


Figure 25. Confusion Matrices for Black and White "Original" (A) and "Cropped"
                                       (B).

Figure 26. Incorrectly predicted edge images (A, B, and C) from the category
                           "Black and White Cropped".


         The images shown in Figure 26 are edge damage images predicted to be
corner damage. Image A has been predicted incorrectly across multiple
modification categories. In this case, it can be assumed that the background is not
what causes the ANN to predict incorrectly, as it was predicted to be corner damage
consistently. For image B, it could be assumed that the ANN chose the category
“corner” because the entire edge impacted is not visible in the image. This raises
questions about image C, though. The entire edge is visible in the image, but it was
still predicted to be corner damage. This could be due to the largest amount of
visible damage being shown in the corner closest to the camera. The black-and-
white image modification does not highlight the damage shown on the edge very
strongly in this case. As shown in the confusion matrices for the “Black and White”
category, it is not always the case that cropped images are predicted at a higher
accuracy rate than original photos of the same modification.
         Figure 27 shows confusion matrices from the highest predictive accuracy
categories Black and White Original (A), High Saturation Original (B), High
Saturation Cropped (C), and Low Contrast Cropped (D). All of these categories
were predicted at a rate of 91% accuracy.


 Figure 27. Confusion matrices for the highest predicted categories: Black and
White Original (A), High Saturation Original (B), High Saturation Cropped (C),
                       and Low Contrast Cropped (D).


        Three of these categories, Black and White Original (A), High Saturation
Original (B), and High Saturation Cropped (C), predicted all compression images
correctly. In the category Low Contrast Cropped (D), all but one of the compression
images were predicted correctly, with one being predicted to be in the category
‘corner.’ This suggests that compression images are predicted correctly with high
consistency, regardless of the image modification performed. Most of the differences
between the matrices lie within the corner and edge image predictions. In the
categories Black and White Original (A) and High Saturation Cropped (C), two corner
images were predicted to be ‘edge,’ while two edge images were predicted to be
‘corner.’ For the category High Saturation Original (B), one corner image was
predicted to be ‘edge,’ one edge image was predicted to be ‘corner,’ and two edge
images were predicted to be ‘compression.’ In Low Contrast Cropped (D), one edge
image was predicted to be corner damage, and two corner images were predicted to be
edge damage. Drop testing a corrugated RSC can produce damage that extends beyond
the point of impact, resulting in a higher chance that the damage observed could fit
into multiple categories. Figure 28 below shows an example of one corner drop
image and one edge drop image from the “Original” category with damage that
could result in a prediction for either category. Image A in this figure is a corner
drop image and image B is an edge drop image. Both of these examples show
damage that extends beyond the point of impact. In image A, the left edge
associated with the corner drop shows signs of impact. In image B, the corner
associated with damage is closest to the camera, showing the corner damage more
than the edge damage. Both of these images were part of the training group, but
they could have been predicted as either “Edge” or “Corner” damage if included in
the testing group. Additionally, there are many more options for RSC positioning
in impact testing compared to compression testing. In total, there were 14
orientations utilized in drop testing compared to the 2 orientations for compression
testing. Most of the damage observed in compression testing is in the form of a line
at the buckling point of the RSC that extends throughout the case. This kind of
damage is more consistent than impact testing, which could explain the difference
in predictions between the three categories of damage.
      Figure 28. Images that could fit in either "Edge" or "Corner" prediction
              categories. Image A: Corner Drop, Image B: Edge Drop.
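
         The difference between compression and drop predictions can also be
expressed as the share of each damage category predicted correctly. The sketch
below derives these shares for the four highest-accuracy categories from the
misclassifications described in the text and the testing counts in Table 9,
assuming the described errors are complete. The names are illustrative.

```python
TEST_COUNTS = {"Compression": 16, "Corner": 12, "Edge": 14}  # testing column, Table 9

# Misclassified test images per actual damage category, as described in the text.
ERRORS_BY_CATEGORY = {
    "Black and White Original": {"Corner": 2, "Edge": 2},
    "High Saturation Original": {"Corner": 1, "Edge": 3},
    "High Saturation Cropped": {"Corner": 2, "Edge": 2},
    "Low Contrast Cropped": {"Compression": 1, "Corner": 2, "Edge": 1},
}

def correct_share_by_class(errors):
    """Share of each actual class predicted correctly (per-class recall)."""
    return {
        cls: (total - errors.get(cls, 0)) / total
        for cls, total in TEST_COUNTS.items()
    }

shares = {
    category: correct_share_by_class(errors)
    for category, errors in ERRORS_BY_CATEGORY.items()
}

for category, per_class in shares.items():
    summary = ", ".join(f"{cls} {value:.0%}" for cls, value in per_class.items())
    print(f"{category}: {summary}")
```

Each of the four categories has four misclassified images in total, yet only one
of those sixteen errors involves a compression image.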


                       CHAPTER 5: CONCLUSION
        This thesis proposes a method that uses Artificial Neural Networks (ANN)
to evaluate corrugated box damage using images. The process begins with
collecting data from previous validation testing images, from e-commerce platforms
using a web scraping tool, or from new images created in-lab. The data is then
prepared for ANN modeling by sorting each image into a damage category and
modifying it to fit into 12 image modification categories. The ANN is modeled
through data splitting, determining the neuron structure, and selecting an
appropriate activation function and solver. Predictions are verified through manual
verification and confusion matrix evaluation to ensure that they are valid for
consideration. The results show that modifying images in various ways provides high
predictive accuracy, up to 91%, in multiple categories of modifications. Adopting
this approach offers several benefits, including reduced expenses for retesting
packages in case of damage following the initial validation phase, high predictive
accuracy, and streamlined processes due to the reduction in time required to assess
damaged packages. This method is novel to the packaging industry and can be
expanded upon. Future research ideas include expanding from lab-created images
to review images from an e-commerce platform and predicting damage from a
combination of validation test images and e-commerce review images.
Additionally, this methodology could be expanded to account for the numerous
packaging types used in the industry.


                              BIBLIOGRAPHY
Šegota, S. B. (2020). Frigate Speed Estimation Using CODLAG Propulsion
       System Parameters and Multilayer Perceptron. NAŠE MORE: znanstveni
       časopis za more i pomorstvo, 117-125.
ASTM International. (2016, Dec 27). Standard Test Methods for Vibration
       Testing of Shipping Containers. Retrieved from ASTM Compass:
       https://compass.astm.org/document/?contentCode=ASTM%7CD0999-
       08R15%7Cen-US
ASTM International. (2020, Oct 22). Standard Test Method for Determining
       Compressive Resistance of Shipping Containers, Components, and Unit
       Loads. Retrieved from ASTM Compass:
       https://compass.astm.org/document/?contentCode=ASTM%7CD0642-
       20%7Cen-US
ASTM International. (2022, Feb 18). Standard Practice for Performance Testing
       of Shipping Containers and Systems. Retrieved from ASTM Compass:
       https://compass.astm.org/document/?contentCode=ASTM%7CD4169-
       22%7Cen-US
ASTM International. (2022, May 13). Standard Test Method for Random
       Vibration Testing of Shipping Containers. Retrieved from ASTM
       Compass:
       https://compass.astm.org/document/?contentCode=ASTM%7CD4728-
       17R22%7Cen-US
Biancolini, M. E. (2003). Numerical and experimental investigation of the
       strength of corrugated board packages. Packaging Technology and
       Science, 821-832.
Bloom, J. Z. (2005). Market Segmentation: A neural network application. Annals
       of tourism research, 93-111.
Bottou, L. B. (2012). The Tradeoffs of Large Scale Learning. Optimization for
       Machine Learning, pp. 351-368.
Brownlee, J. (2017, July 3). Gentle Introduction to the Adam Optimization
       Algorithm for Deep Learning. Retrieved from Machine Learning Mastery:
        https://machinelearningmastery.com/adam-optimization-algorithm-for-
        deep-learning/
Coppola, D. (2022). E-commerce as share of total U.S. retail sales from 1st
        quarter 2010 to 3rd quarter 2022. Retrieved from Statista: https://www-
        statista-com.proxy1.cl.msu.edu/statistics/187439/share-of-e-commerce-
        sales-in-total-us-retail-sales-in-2010/
de Abajo, N. D. (2004). ANN quality diagnostic models for packaging
        manufacturing: an industrial data mining case study. Tenth ACM SIGKDD
        international conference on Knowledge Discovery and data mining, (pp.
        799-804).
Draelos, R. (2019, Feb 17). Measuring Performance: The Confusion Matrix.
        Retrieved from Glass Box Medicine:
        https://glassboxmedicine.com/2019/02/17/measuring-performance-the-
        confusion-matrix/
Dubey, S. S. (2022). Activation functions in deep learning: A comprehensive
        survey and benchmark. Neurocomputing, 92-108.
Dunno, K. (2017). Effects of transportation hazards on high barrier flexible
        packaging films. Journal of Applied Packaging Research.
Esfahanian, S. &. (2022). A novel packaging evaluation method using sentiment
        analysis of customer reviews. Packaging Technology and Science, 903-
        911.
Fadiji, T. C. (2016). Susceptibility to impact damage of apples inside ventilated
        corrugated paperboard packages: Effects of package design. Postharvest
        Biology and Technology, 286-296.
Frank, B. G. (2010). Compression testing to simulate real-world stresses.
        Packaging Technology and Science, 275-282.
Hallbäck, N. K. (2014). Finite element analysis of hot melt adhesive joints in
        carton board. Packaging Technology and Science, 701-712.
Hecht-Nielsen, R. (1987). Kolmogorov's mapping neural network existence
        theorem. International conference on Neural Networks (pp. 11-14). New
        York, NY: IEEE Press.
Huang, G. B. (2003). Learning capability and storage capacity of two-hidden-layer
        feedforward networks. IEEE transactions on neural networks, 274-281.
Huang, T. C. (2022). Investigations of structure strength and ventilation
        performance for agriproduct corrugated cartons under long-term
        transportation trip. Packaging Technology and Science, 821-832.
International Trade Administration. (2023). Ecommerce sales & size forecast.
        Retrieved from International Trade Administration: Trade.gov
ISTA. (2016). Ships in Own Container (SIOC) for Amazon.com Distribution
        System Shipment. Retrieved from International Safe Transit Association:
        https://ista.org/docs/6AmazoncomSIOCOverview.pdf
ISTA. (2018). Packaged-Products for Parcel Delivery System Shipment 70 kg
        (150 lb) or less. Retrieved from International Safe Transit Association:
        https://ista.org/docs/3Aoverview.pdf
Jílková, P. &. (2021). Digital consumer behaviour and ecommerce trends during
        the COVID-19 crisis. International Advances in Economic Research, 83-
        85.
Jung, H. M. (2012). Effects of vibration fatigue on compression strength of
        corrugated fiberboard containers for packaging of fruits during transport.
        Journal of Biosystems Engineering.
Kůrková, V. (1992). Kolmogorov's theorem and multilayer neural networks.
        Neural networks, 501-506.
Karsoliya, S. (2012). Approximating number of hidden layer neurons in multiple
        hidden layer BPNN architecture. International Journal of Engineering
        Trends and Technology, 714-717.
Kaylan, K. J. (2014). Artificial neural network application in the diagnosis of
        disease conditions with liver ultrasound images. Advances in
        bioinformatics.
Krotov, V. &. (2018). Legality and ethics of web scraping. Twenty-fourth
        Americas Conference on Information Systems. New Orleans: Emergent
        Research Forum.
Lansmont. (2023). Model 1800 Vertical vibration test system. Retrieved from
        Lansmont:
        https://www.lansmont.com/products/vibration/vertical/lansmont-standard-
        1800
Lansmont. (2023). PDT 80 Precision Drop Tester. Retrieved from Lansmont:
        https://www.lansmont.com/products/drop/lansmont-pdt-80
Lansmont. (2023). Saver(TM) AM-Asset Monitor Shock and vibration data
        logger. Retrieved from Lansmont:
        https://www.lansmont.com/products/data_loggers/saver_am
Lansmont. (2023). SqueezerPro Compression tester. Retrieved from Lansmont:
        https://www.lansmont.com/product/compression-testers/squeezerpro
Liu, D. C. (1989). On the limited memory BFGS method for large scale
        optimization. Mathematical Programming, 503-528.
Mills, N. J.-M. (2005). Finite element analysis (FEA) applied to polyethylene
        foam cushions in package drop tests. Packaging Technology and Science:
        An International Journal, 29-38.
Molina, E. H. (2021). Development of a friction-driven finite element model to
        simulate the load bridging effect of unit loads stored in warehouse racks.
        Applied Sciences.
Nygårds M, S. S. (2019). Simulation and experimental verification of a drop test
        and compression test of a gable top package. Packaging Technology and
        Science, 32(7):325-333.
Obuchowski, A. (2020, April 16). Understanding neural networks 2: The math of
        neural networks in 3 equations. Retrieved from Becoming Human:
        Artificial Intelligence Machine: https://becominghuman.ai/understanding-
        neural-networks-2-the-math-of-neural-networks-in-3-equations-
        6085fd3f09df
Orange Data Mining Platform. (2023). Orange. Retrieved from Orange Data
        Mining: https://orangedatamining.com/
Rebala, G. R. (2019). Machine learning definition and basics. An introduction to
        machine learning. Retrieved from Springer:
        https://www.springer.com/journal/10994
Rowson, J. Y. (2008). Modelling capping of 28 mm beverage closures using finite
        element analysis. Packaging Technology and Science, An International
        Journal, 287-296.
Rycobel. (2023). Box compression tester (BCT) - Compression strength tester.
        Retrieved from Rycobel: https://www.rycobel.com/products/box-
        compression-tester
Sensolus. (2023). Use cases: Condition Monitoring. Retrieved from Sensolus:
        https://www.sensolus.com/use-cases/transport-condition-monitoring/
Skyline University College. (2016). Packaging for a new era of e-commerce.
        Retrieved from Skyline University College:
        https://www.skylineuniversity.ac.ae/pdf/ecommerce/Bemis-eBook-
        eCommerce.pdf
Spruit, D. A. (2021). First market study in e-commerce food packaging:
        Resources, performance, and trends. Food Packaging and Shelf Life.
Twede, D. &. (2005). Cartons, Crates, and Corrugated Board. Seoul, South
        Korea: DESIGN HOUSE Incorporated.
UpGrad. (2022, Sep 22). Neural Network: Architecture, Components & Top
        Algorithms. Retrieved from UpGrad:
        https://www.upgrad.com/blog/neural-network-architecture-components-
        algorithms/
Xie, Y. C. (2023). Methodology selecting and packaging materials combined
        multi-sensory experience and fuzzy three-stage network DEA model.
        Packaging Technology and Science, 125-134.
Zhong, C. L. (2016). Measurement and analysis of shocks on small packages in
        the express shipping environment of China. Packaging Technology and
        Science, 437-449.