This project is a compulsory first part of the examination for the System-on-Chip Design course at the University of Twente. The goals of this project are:
The description below refers to various file names. These files are not available on-line. Once you have logged in, execute the command
get-module syn syn
to get them in a subdirectory syn from which you have to work for this project.
Synopsys is a major developer of computer-aided design tools for logic synthesis (and many other tools). The University of Twente has access to these tools through its membership of Europractice, an initiative of the European Union that provides IC design facilities to universities and research institutes.
The tool that is relevant for this project is the Design Compiler. It is a very sophisticated tool. For the purpose of the SoC Design course, a batch interface has been developed that hides the tool control from the user. The user only needs to declare a few essentials, such as the files that describe the design (see later on).
The cell library used in this project is a 0.18 micron CMOS library from UMC. The university has access to this library through its Europractice membership. As a non-disclosure agreement was signed with Europractice, students are not supposed to disclose any information on this library other than what is already publicly accessible through the Internet.
The UMC180 datasheet should be opened from the command line with:
xreader /remote/labware/technology/UMC/UMCL18U250D2_2.4/datasheets/umcl18u250t2_databook.pdf &
A Unix (Bourne/bash) shell script called generate-design is available for systematically synthesizing all required designs. It deposits its output files in a subdirectory called synopsys_out.
The script first assigns values to a number of shell variables and then takes these values to perform synthesis. From the variable settings, an instance name stored in variable INSTANCE is derived. The instance name is used in many places:
The script generates a log file for each design that has been synthesized. You have to study this file to collect data on area and timing. Warnings and error messages generated during synthesis can also be found in this file. Study them carefully as they contain information on possible problems with your design.
The script generate-design can only be executed by submitting it to a queue for batch processing. This is done as follows:
If your request to execute the script cannot be honored directly, a message will be displayed that your request has been queued. When processing is ready, your shell's prompt will be displayed. You can see the status of the batch queue by means of the command:
squeue
If you want to remove a job from the queue, issue the command:
scancel job number
using the job number that is displayed when you submit a job (and also displayed when listing the queue).
If you have not done so, please study first the relevant parts of the document:
VHDL for Synthesis and Simulation (especially Sections 9, 11, and 12).
Verify that the following files are present in the syn directory from which you will work:
Make sure that the settings in the file generate-design are correct for synthesizing the copy architecture for the siso_gen entity.
Submit the script into the batch queue. When it is ready, study both the log file and the VHDL files produced. Due to the simplicity of the design, the hierarchical and flattened netlists are identical. Warning messages related to the unconnected scan ports can be ignored. These ports will be used in the second module of this course.
Which standard cells do you find in the synthesized design? For each type of cell, look up its functionality in the data book. How many flip-flops does the synthesized netlist contain? For each flip-flop, explain to which element in the original VHDL source it corresponds.
Make sure that the file modelsim.ini is present in the current directory and then add the following files to a new Modelsim project for this exercise:
Perform a pre-synthesis simulation to verify that a 16-bit GCD is correctly computed. What is the name of the configuration to be simulated?
Modify the generate-design script for synthesizing the gcd architecture of the siso_gen circuit. For this exercise, it is sufficient that you synthesize the design for one clock period, say 5 ns. Examine the resulting flattened VHDL netlist. What are the names of the entity and architecture contained in the file? Consult the warnings and error messages in the generated log file and make sure that the entire synthesis process was successful (for example, the slack in the timing report at the end should be positive).
As opposed to the copy architecture, the gcd architecture contains arithmetic functions. The reference report before flattening and the resources report in the log file contain information on the implementation of arithmetic functions. Which arithmetic functions are used in the gcd architecture and which of those have been separately implemented by Synopsys? How many instances are there of each function? Explain the number of arithmetic functions from the VHDL source file.
Compile the flattened VHDL description. This description will be called the post-synthesis description of the VHDL. Before you can simulate this description, you should create a new configuration of the testbench that uses your post-synthesis model as the design under verification. In that new configuration, assign an appropriate value to the generic half_clock_period to match the clock frequency for which the design was synthesized, in this case 2500 ps.
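The value of 2500 ps follows from halving the 5 ns clock period and expressing the result in picoseconds. The arithmetic can be sketched in Python as follows; the function name half_clock_period_ps is hypothetical and only serves this illustration, it is not part of the course files.

```python
def half_clock_period_ps(clock_period_ns):
    """Return half of a clock period given in ns, as a whole number of ps."""
    # 1 ns = 1000 ps; the testbench generic expects half of the full period.
    return round(clock_period_ns * 1000 / 2)


print(half_clock_period_ps(5))    # 2500
print(half_clock_period_ps(3.5))  # 1750
```

The same conversion applies to any other clock period for which the design is synthesized later on.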
Simulate the design. Make sure that the SDF file is taken into account (see the section on gate-level simulations in the Concise Manual for the Modelsim/Questasim Simulator). If everything went well, the waveforms that you observe should be (almost) identical to those of the pre-synthesis simulation. There are some minor differences, though: there will be a delay between a clock edge and a signal transition that is supposed to take place at the clock edge. Show this effect by selecting relevant signals in Modelsim's Wave window and zooming to the right time scale. Print the resulting waveform plot. Note: Modelsim's time resolution should be set to picoseconds to be able to see the effect. In case of trouble, make sure that the variable 'Resolution' (with either a lower-case or upper-case 'r') has the value '1ps' in the "mpf" file, the file that describes your Modelsim project. If you modify this file, you will need to restart vsim.
You have already seen that one of the parameters to be set for logic synthesis is the clock period. Given a clock period, the synthesis tool tries to manipulate the longest combinational path (starting and ending in a register) to fit the given clock period. Such an approach to synthesis is called time-constrained synthesis. The supplied value for the clock period is first reduced to accommodate the setup and hold times of the registers.
Synopsys operates approximately as follows. It first estimates the time available for complex functions such as arithmetic operators. It has a library of alternative implementations for these functions (think e.g. of different structures for adders; the faster structures require more hardware). The library is called DesignWare (DW, the abbreviation shows up in the hierarchical VHDL that is generated by Synopsys). After having composed all necessary logic with cells from the target standard-cell library, it checks whether the timing constraints are met. If this is the case, it stops. If the constraints are violated, it starts manipulating the logic in order to shorten the critical paths. In general, this results in more area. This type of manipulation of the logic is an iterative process. It stops either as soon as the timing constraints have been met or when Synopsys judges that all possibilities to shorten the critical paths have been exhausted.
The log files contain the outputs of the Synopsys report_reference (the cells used, their areas and the total area) and report_timing (critical path) commands. The term slack refers to the difference between the timing constraint and the critical-path length. A negative slack means that the timing constraints have not been met. Because Synopsys tries all kinds of transformations and then gives up when it does not find a solution, the area figures reported for a solution with negative slack are not reliable. They should not be taken into account in the analyses.
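The sign convention for slack can be illustrated with a minimal Python sketch. It assumes that slack is computed as the timing constraint minus the critical-path length, which is consistent with a negative value meaning a violated constraint. The function names and the numbers are made up for the illustration and are not taken from any log file.

```python
def slack_ns(constraint_ns, critical_path_ns):
    """Slack as the time left over: constraint minus critical-path length."""
    return constraint_ns - critical_path_ns


def constraint_met(constraint_ns, critical_path_ns):
    """A timing constraint counts as met when the slack is not negative."""
    return slack_ns(constraint_ns, critical_path_ns) >= 0


# Illustrative values only (not real synthesis results).
print(slack_ns(5.0, 4.75), constraint_met(5.0, 4.75))  # 0.25 True
print(slack_ns(2.0, 2.5), constraint_met(2.0, 2.5))    # -0.5 False
```

In the second case the area figure reported for that run would be discarded, as explained above.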
In order to explore the design space for a single VHDL description of some circuit, one can synthesize it for different clock periods and then collect the time-area pairs reported by Synopsys (and recorded in the log file by the script generate-design). Solutions for which the timing constraint was not met should be ignored.
The procedure just mentioned has been applied to a 16-bit version of the gcd architecture as presented earlier. It was also applied to a cleverer design of the 16-bit GCD algorithm that is not disclosed to you. The results, in tabular and graphic form, can be found in the Excel sheet gcd22.xls available directly under this project's description under the Modules section of Canvas.
It contains the time-area pairs that are meaningful. There are a few points to take into account:
Each solution to the design problem considered here has two figures of merit to be minimized: time and area. Such a solution will be denoted by (t, a). By varying either the input VHDL architecture or the clock period constraint, one gets different solutions. Because of the two figures of merit, it is not possible to decide which of the solutions is the best one. However, it is possible to make a statement about one solution being inferior to another one. A solution (t1, a1) is said to be inferior to (t2, a2) if t1 is larger than t2 and a1 is larger than a2.
A solution that is not inferior to any other solution in the solution set is said to be Pareto optimal. There may be multiple solutions that are Pareto optimal in a solution set.
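The definition of inferiority given above translates directly into a filter that selects the Pareto-optimal points from a set of (t, a) pairs. The following is a minimal Python sketch, not a prescribed method. The function names are hypothetical and the time-area pairs are invented for the illustration; they are not taken from the Excel sheet.

```python
def is_inferior(s1, s2):
    """True if s1 = (t1, a1) is inferior to s2 = (t2, a2): t1 > t2 and a1 > a2."""
    t1, a1 = s1
    t2, a2 = s2
    return t1 > t2 and a1 > a2


def pareto_optimal(solutions):
    """Keep the solutions that are not inferior to any other solution."""
    return [s for s in solutions
            if not any(is_inferior(s, other) for other in solutions)]


# Invented (time in ns, area) pairs, for illustration only.
points = [(5.0, 1000), (4.0, 1100), (3.0, 1500), (3.5, 1600), (2.5, 1800)]
print(pareto_optimal(points))
# [(5.0, 1000), (4.0, 1100), (3.0, 1500), (2.5, 1800)]
```

Here (3.5, 1600) is dropped because (3.0, 1500) is both faster and smaller. Note that the sketch follows the strict comparison used in the definition, so two solutions that tie in one figure of merit are not considered inferior to each other.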
As you should know by now, only a subset of VHDL represents hardware that can be synthesized. If your VHDL code contains constructs that cannot be synthesized, the Design Compiler will generate error messages that will be included in the log file. It can also happen that the code is synthesized into hardware that was not intended. The introduction of latches for combinational logic described by an if statement of which the else branch was forgotten is an example of such behavior. The latch is not intended, but the tools will insert it because the absence of the else branch implies that a signal does not change and hence needs to be stored. There are also situations in which a latch is intended; such situations are outside the scope of this course.
The synthesizable subset of VHDL still allows for a large variety of hardware to be designed. One may want to constrain the class of hardware to be synthesized more. The design guidelines for the hardware to be designed in the System-on-Chip Design course are e.g.:
All code that is distributed for the exercises (except for testbench code) complies with these guidelines.
You have already synthesized the 16-bit gcd architecture for siso_gen for a timing constraint of 5 ns. Repeat the synthesis with constraints of 4, 3.5, 3, 2.5, 2 and 1.5 ns. The results that you obtain should be identical to the first results given in file gcd22.xls. Compare the log files of the synthesis runs and pay special attention to the reports about the hierarchical versions of the solutions. Note that a dot in the time constraint is translated into a "p" in the instance name.
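The translation of the dot into a "p" can be illustrated with a small Python sketch. It only shows the fragment of the instance name that represents the time constraint; the full format of the instance name is determined by the generate-design script and is not reproduced here. The function name constraint_tag is hypothetical.

```python
def constraint_tag(clock_period_ns):
    """Return the constraint, given as text, with the decimal dot replaced by 'p'."""
    return clock_period_ns.replace(".", "p")


# The constraints of this exercise, written as they would be typed.
for period in ["5", "4", "3.5", "3", "2.5", "2", "1.5"]:
    print(period, "->", constraint_tag(period))
```

For example, a constraint of 3.5 ns yields the fragment 3p5, while a constraint of 4 ns stays 4.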
Try to identify a point of improvement in the given VHDL code for the gcd architecture. The block diagram as given under VHD-2 of Project VHD is available to you to help you in proposing an improvement. Create a new architecture (copy the original one and rename the architecture) that implements the improvement. Motivate your design choice. Note: the new design should not use more clock cycles!
Verify your pre-synthesis VHDL by simulation.
Synthesize your solution first for one value of the clock period and then simulate the post-synthesis VHDL. Compare the area numbers with the results of SYN-2. It is likely that the area for arithmetic blocks has decreased. Is there any increase in the rest of the hardware? Explain.
If no problems are encountered, synthesize your code for all clock-period values between 1.5 ns and 4 ns and extract the relevant time-area pairs for each constraint.
Add these data to the Excel sheet gcd22.xlsx and include them in the time-area plot given in the Excel sheet. Clearly indicate the Pareto-optimal points in the sheet.
Evaluate your results. Comment on the improvements that you have achieved (or failed to achieve).
Note: If you do not succeed in finding an architecture that improves the design, just take an architecture that is different from the one provided. If time permits, feel free to experiment with multiple alternative solutions. However, only include a single solution in your report.
Use the Canvas "file upload" feature to upload the following items: