Abstract
The progress in generative AI, particularly large language models (LLMs), opens new prospects in design and manufacturing. Our research explores the use of these tools throughout the entire design and manufacturing workflow. We assess the capabilities of LLMs in various tasks: converting text prompts into designs, generating design spaces and variations, transforming designs into manufacturing instructions, evaluating design performance, and searching for designs based on performance metrics. We identify and discuss the current strengths and limitations of LLMs, suggesting areas for potential enhancements. Additionally, we examine the ethical implications and propose strategies to mitigate risks associated with employing generative AI in design and manufacturing.
1. Introduction
Computer-aided technologies (CAx), which encompass computational tools for design, analysis, and manufacturing of products, have significantly influenced various industries such as automotive, aerospace, architecture, electronics, biomedical, and digital media. However, the full potential of CAx workflows is often hindered by challenges such as the need for extensive domain-specific knowledge and the time required to utilize CAx software packages or to integrate these solutions into existing workflows. The advent of generative AI, particularly large language models (LLMs), seems poised to help us overcome these obstacles. By automating or semiautomating each stage of the CAx process, LLMs can serve as a software copilot with intuitive and user-friendly interfaces such as interactive chats with multimodal inputs. This would streamline the use of CAx tools and integrate them more efficiently into broader workflows.
Our analysis examines the standard CAx workflow to identify opportunities for automation or acceleration through generative AI methods. We dissect the workflow into five distinct phases, assessing the potential for efficiency and quality enhancements in each phase by integrating generative AI tools. The components under review are: (1) generating a design, (2) constructing a design space and design variations, (3) preparing and documenting designs for manufacturing, (4) evaluating a design’s performance, and (5) discovering high-performing designs within a given performance metric and design space.
Although it would be feasible to create specialized LLMs for CAx, our study highlights the benefits of using generic, pretrained models. We base our experiments on GPT-4, a leading general-purpose LLM, to demonstrate its application in CAx workflows. This approach illustrates how LLMs can streamline and accelerate the design and production of complex objects. Our analysis further explores how LLMs can integrate with existing solvers, algorithms, tools, and visualizers to create a cohesive workflow. Additionally, we identify current limitations of LLMs in design and manufacturing, pointing out avenues for future enhancements in both the models and the augmented workflows. Finally, we address the ethical considerations and dual-use risks associated with these tools, along with potential strategies for mitigation.
2. Evaluation of Current LLM for Design and Manufacturing
Our evaluation’s primary goal is to explore the potential and challenges of integrating contemporary LLMs into the CAx workflow. Recognizing that every component of the CAx workflow—from design and design spaces to manufacturing instructions, performance metrics, and technical documentation—can be represented as compact programs, we employ LLMs across these diverse tasks. We conceptualize each phase of the CAx workflow as a translation layer, transforming input (which may be natural language or a domain-specific language, DSL for short) into an output DSL. Given LLMs’ proficiency in symbolic manipulations, they show promise in tackling these tasks, potentially enhancing our traditional methods. Our investigation covers each stage of the design and manufacturing process, including design generation from natural language, design space creation, design for manufacturing, performance prediction, and inverse design.
2.1 Generating Designs from Natural Language Specifications
We evaluate the ability of LLMs to create designs from natural language instructions, spanning various design contexts such as individual parts, hierarchical assemblies, and hybrid designs that integrate existing components. The LLM demonstrates proficiency in generating designs from high-level textual input, effectively handling a diverse range of representations and problem domains. For instance, we assess the LLM’s capability using OpenJSCAD, an open-source JavaScript-based CAD library. In every scenario, the LLM generates coherent and well-structured code, complete with semantically meaningful variables and comments. It also interprets and completes underspecified prompts by inferring and supplying plausible values for missing parameters. For example, we asked the LLM to design a simple cabinet with a shelf, as depicted in (ref?). During our experiments, we noted some recurring difficulties the LLM has when generating designs in OpenJSCAD; it corrects most of them when guided by the user.
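To make the cabinet example concrete, the following Python sketch shows the general shape of a parametric program that such a prompt might yield. It is not the LLM's OpenJSCAD output: the `Box` type, the panel names, and all default dimensions are assumptions chosen for illustration. The `shelf_height` default mirrors how an underspecified prompt can be completed with a plausible value.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned panel: a named box given by its minimum corner and size."""
    name: str
    origin: Tuple[float, float, float]  # (x, y, z) of the minimum corner
    size: Tuple[float, float, float]    # extent along (x, y, z)


def cabinet(width=60.0, height=80.0, depth=30.0, thickness=2.0,
            shelf_height=None):
    """Return the panels of an open-front cabinet with one shelf.

    All defaults are illustrative assumptions, not values from the paper.
    A missing shelf_height is filled in with a plausible value (mid-height).
    """
    if shelf_height is None:
        shelf_height = height / 2.0
    t = thickness
    inner_width = width - 2 * t  # space between the two side panels
    return [
        Box("left_side", (0.0, 0.0, 0.0), (t, depth, height)),
        Box("right_side", (width - t, 0.0, 0.0), (t, depth, height)),
        Box("bottom", (t, 0.0, 0.0), (inner_width, depth, t)),
        Box("top", (t, 0.0, height - t), (inner_width, depth, t)),
        Box("back", (t, depth - t, t), (inner_width, t, height - 2 * t)),
        # The shelf is centered on shelf_height and stops at the back panel.
        Box("shelf", (t, 0.0, shelf_height - t / 2), (inner_width, depth - t, t)),
    ]
```

Calling `cabinet()` yields six panels, and `cabinet(shelf_height=25.0)` moves only the shelf. Keeping every panel dependent on a few named parameters is what later allows the same program to serve as a parametric design.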

To test the limits of LLM-driven design, we asked the LLM to design a functional quadcopter, which requires integrating prebuilt elements such as motors, propellers, and batteries. Once these components are sourced, the challenge lies in designing a frame that accommodates their dimensions, so we focused the task primarily on a frame that securely holds the selected components. The initial frame proposed by the LLM was impractical: it was attached directly to the motor cylinder and lacked adequate support for components such as the battery, controller, and signal receiver. With minor adjustments, the LLM refined the design into a practical version. This final design was then subjected to further evaluation in a simulator or under real-world conditions, as illustrated in (ref?).
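The dependency between sourced components and frame geometry can be sketched as a small calculation. The Python below is a hypothetical illustration, not the frame produced in our experiment: it assumes an X-shaped layout with motors at the corners of a square, and the function names, the `clearance` and `margin` defaults, and the units (millimeters) are all assumptions.

```python
import math


def min_arm_length(prop_diameter, clearance):
    """Smallest center-to-motor distance for an X-shaped quadcopter frame.

    With motors at the corners of a square, adjacent motors are
    arm_length * sqrt(2) apart. That spacing must cover one full
    propeller diameter plus a clearance between blade tips.
    """
    return (prop_diameter + clearance) / math.sqrt(2.0)


def frame_dimensions(prop_diameter, battery_size, clearance=10.0, margin=5.0):
    """Derive frame dimensions (mm) from the selected components.

    battery_size is (length, width). clearance and margin are
    illustrative defaults, not values from the paper.
    """
    plate_length = battery_size[0] + 2 * margin
    plate_width = battery_size[1] + 2 * margin
    arm = min_arm_length(prop_diameter, clearance)
    # Each arm must also extend beyond the farthest point of the central plate.
    plate_corner = math.hypot(plate_length / 2, plate_width / 2)
    return {
        "plate_length": plate_length,
        "plate_width": plate_width,
        "arm_length": max(arm, plate_corner + margin),
    }
```

A call such as `frame_dimensions(127.0, (75.0, 35.0))` sizes the central plate around the battery and lengthens the arms until neighboring propellers no longer overlap. A practical frame would add further checks, for example mounts for the controller and signal receiver.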

Although the LLM sometimes produces flawed initial designs, it effectively corrects errors after a few user interactions. This capability for iterative design is particularly beneficial for constructing complex structures. Users can begin with a simple prompt and gradually increase complexity to achieve the desired outcome. The LLM excels in incorporating modules and hierarchical structures through natural language, showcasing its skill in managing high-level structure and discrete composition. It reliably creates accurate primitives in various design languages. Additionally, the LLM’s output is characterized by its readability and maintainability. The generated code features descriptive variable names, helpful comments, and proper modularity. Users also have the option to request custom hierarchical refactoring, which aligns well with the preferences of human programmers and designers.
2.2 Design Spaces via Natural Language and Extrapolation
A design space encompasses a range of potential designs. A common approach to creating a design space is through parametric designs, where design parameters may be continuous or discrete. However, a parametric design alone is insufficient for exploring design variations, either manually or automatically. It is also necessary to specify which values each design parameter may take. This can be achieved by setting lower and upper bounds for each parameter, allowing each to take any value within these limits. A design space, therefore, is defined by the parametric design combined with these parameter bounds, representing the entire set of possible design variations.
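This definition, a parametric design together with per-parameter bounds, maps directly onto a small data structure. The following Python sketch is a minimal illustration under stated assumptions: the `DesignSpace` class, the toy `table` design, and the bound values are hypothetical, and only continuous parameters are handled (discrete parameters would need a different sampler).

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class DesignSpace:
    """A parametric design plus lower/upper bounds for each parameter."""
    design: Callable[..., object]            # maps parameter values to a design
    bounds: Dict[str, Tuple[float, float]]   # parameter name -> (lower, upper)

    def contains(self, params):
        """True if every bounded parameter is present and within its bounds."""
        return all(name in params and lo <= params[name] <= hi
                   for name, (lo, hi) in self.bounds.items())

    def sample(self, rng):
        """Draw one design variation uniformly at random within the bounds."""
        params = {name: rng.uniform(lo, hi)
                  for name, (lo, hi) in self.bounds.items()}
        return params, self.design(**params)


def table(width, depth, height):
    """Toy parametric design: returns a summary instead of real geometry."""
    return {"top_area": width * depth, "leg_length": height}


# Illustrative bounds (assumed values, arbitrary units).
space = DesignSpace(table, {"width": (80.0, 200.0),
                            "depth": (50.0, 100.0),
                            "height": (60.0, 90.0)})
params, variation = space.sample(random.Random(0))
```

Every call to `space.sample` returns one member of the design space, and `space.contains` checks whether a hand-picked parameter set is a valid variation. The same structure supports both manual exploration and automatic search.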
The initial step in creating a design space is to verify the LLM’s ability to generate parametric designs. To ensure the generation of good parametric designs, we instruct the LLM to use high-level design parameters and minimize the number of variables. Interestingly, the LLM often introduces additional variables for readability on its own initiative, even without specific instructions. We have observed that incorporating these guidelines into our prompts consistently leads to the creation of parametric designs. Alongside this, establishing parameter bounds is crucial for defining a design space. When prompted for lower and upper bounds, the LLM suggests values based on the typical proportions of the object being designed. Although the absolute scale is arbitrary, the proposed bounds are semantically reasonable and proportionate to each other.
We also explored the LLM's capacity to infer a parametric design space from a specific preexisting design provided by the user. The input designs for the LLM are provided in a text-based DSL such as OpenJSCAD, with varying degrees of semantic annotations and explanatory comments. Enriching LLM inputs with more semantic information noticeably enhances the quality of the resulting design space. For instance, including the name of the object being modeled in the design proves beneficial for generating a parametric design. This approach reveals design parameters that are semantically more relevant for altering the design, as illustrated in (ref?). We find that the LLM’s extensive semantic knowledge base can be utilized to generate parameters, bounds, and constraints not only for text-based designs but also for preexisting designs, as shown in (ref?).


Design spaces derived from a single design enable the exploration of various shapes through parameter adjustment. However, for more substantial structural modifications, designers might consider combining elements from two or more designs within the same object class. This process of design interpolation presents its own challenges. To test the LLM’s capabilities in this area, we presented it with designs of a bicycle and a quad bike, which differ in wheel number and fork construction. The bicycle has a fork encasing the wheel, while the quad bike’s wheels attach to the frame with horizontal bars. Tasked with designing a tricycle, the LLM successfully determined the appropriate wheel arrangement and adapted the quad bike’s frame for wheel alignment. However, it did not fully incorporate the bicycle’s fork design, suggesting room for improvement with additional user guidance (ref?). This experiment highlights both the LLM’s potential for merging design concepts and its coding proficiency. We observe that the LLM can interpolate existing designs by extracting and adapting subdesigns based on their program representations. Interestingly, even when designs are not presented in a modular fashion, it tries to recognize and abstract submodules in input designs.
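Interpolation over program representations can be pictured as recombining named submodules. The Python sketch below is a deliberately simplified illustration, not the LLM's output: the dictionary-based design format, the submodule names, and the `interpolate` helper are assumptions. It shows the intended recombination (bicycle fork in front, quad-bike axle at the rear), which the LLM only partially achieved in our experiment.

```python
def bicycle():
    """Submodules of a toy bicycle: each wheel is held by a fork."""
    return {
        "frame": {"type": "diamond_frame"},
        "front": {"type": "fork", "wheels": 1},
        "rear": {"type": "fork", "wheels": 1},
    }


def quad_bike():
    """Submodules of a toy quad bike: wheel pairs sit on horizontal bars."""
    return {
        "frame": {"type": "box_frame"},
        "front": {"type": "axle_bar", "wheels": 2},
        "rear": {"type": "axle_bar", "wheels": 2},
    }


def interpolate(design_a, design_b, take_from_b):
    """Copy design_a, then swap in the listed submodules from design_b."""
    merged = {name: dict(module) for name, module in design_a.items()}
    for name in take_from_b:
        merged[name] = dict(design_b[name])
    return merged


def wheel_count(design):
    """Total number of wheels across all submodules."""
    return sum(module.get("wheels", 0) for module in design.values())


# Tricycle: bicycle frame and front fork, quad-bike rear axle.
tricycle = interpolate(bicycle(), quad_bike(), take_from_b=["rear"])
```

The sketch presupposes that both inputs already expose matching submodules. When the input designs are not modular, identifying and abstracting those submodules is the harder step, and it is the one the LLM attempts on its own.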
