Advanced Systems Programming - Lesson 15: Access to video memory

Access to video memory

We create a Linux device-driver that gives applications access to our graphics frame-buffer

The role of a device-driver

[Figure: in user space, a user application calls the standard “runtime” libraries (call/ret); the libraries enter the Operating System kernel in kernel space (syscall/sysret); the kernel calls the device-driver module (call/ret); the driver reaches the hardware device (out/in) and i/o memory; RAM is shown alongside]

A device-driver is a software module that controls a hardware device in response to OS kernel requests relayed, often, from an application

Raster Display Technology

The graphics screen is a two-dimensional array of picture elements (‘pixels’)
Each pixel’s color is an individually programmable mix of red, green, and blue
These pixels are redrawn sequentially, left-to-right, by rows from top to bottom

Special “dual-ported” memory

[Figure: block diagram showing the CPU, 2048-MB of RAM, 16-MB of VRAM, and the CRT]

How much VRAM is needed?

• This depends on (1) the total number of pixels, and on (2) the number of bits-per-pixel
• The total number of pixels is determined by the screen’s width and height (measured in pixels)
• Example: when our “screen-resolution” is set to 1280-by-960, we are seeing 1,228,800 pixels
• The number of bits-per-pixel (“color depth”) is a programmable parameter (varies from 1 to 32)
• Example: at 1280-by-960 with 32 bits-per-pixel, the visible screen occupies 1,228,800 × 4 = 4,915,200 bytes (about 4.7 MB)
• Certain types of applications also need to use extra VRAM (for multiple displays, or for “special effects” like computer game animations)

How ‘truecolor’ works

[Figure: a 32-bit pixel longword holding alpha (bits 31-24), red (bits 23-16), green (bits 15-8), and blue (bits 7-0)]

The intensity of each color-component within a pixel is an 8-bit value

[Figure: x86 uses “little-endian” order: successive VRAM bytes (offsets 0, 1, 2, ...) hold the blue, green, and red components of each pixel on the Video Screen]

“truecolor” graphics-modes use 4-bytes per picture-element

Some operating system issues

• Linux is a “protected-mode” operating system
• I/O devices normally are not directly accessible
• Linux on x86 platforms uses “virtual memory”
• Privileged software must “map” the VRAM
• A device-driver module is needed: ‘vram.c’
• We can compile it using: $ mmake vram
• Device-node: # mknod /dev/vram c 98 0
• Make it ‘writable’: # chmod a+w /dev/vram

Our ‘vram.c’ module

• It’s a character-mode Linux device-driver
• It implements four device-file ‘methods’:
  – ‘read()’: lets a program read from video memory
  – ‘write()’: lets a program write to video memory
  – ‘llseek()’: lets a program ‘move’ the file’s pointer
  – ‘mmap()’: lets a program ‘map’ vram to user-space
• It also implements a pseudo-file that lets users view the RADEON X300 graphics controller’s PCI Configuration Space parameter-values: $ cat /proc/vram

What is PCI?

• It’s an acronym for “Peripheral Component Interconnect” and refers to a collection of industry standards for devices used in PCs
• An Intel-sponsored initiative (from 1992-9) having several ambitious goals:
  – Reduce diversity inherent in legacy PC devices
  – Improve speed and efficiency of data-transfers
  – Eliminate (or reduce) platform dependencies
  – Simplify adding/removing peripheral adapters
  – Lower PC’s total consumption of electrical power

PCI Configuration Space

A non-volatile parameter-storage area for each PCI device-function (64 doublewords in all):

• PCI Configuration Space Header (16 doublewords – fixed format)
• PCI Configuration Space Body (48 doublewords – variable format)

Example: Header Type 0

The 16 doublewords of the header, with the fields of each doubleword listed from bit 31 down to bit 0:

    Dword   Fields
    0       Device ID | Vendor ID
    1       Status Register | Command Register
    2       Class Code (Class/SubClass/ProgIF) | Revision ID
    3       BIST | Header Type | Latency Timer | Cache Line Size
    4       Base Address 0
    5       Base Address 1
    6       Base Address 2
    7       Base Address 3
    8       Base Address 4
    9       Base Address 5
    10      CardBus CIS Pointer
    11      Subsystem Device ID | Subsystem Vendor ID
    12      Expansion ROM Base Address
    13      reserved | capabilities pointer
    14      reserved
    15      Maximum Latency | Minimum Grant | Interrupt Pin | Interrupt Line

Examples of VENDOR-IDs

• 0x8086 – Intel Corporation
• 0x1022 – Advanced Micro Devices, Inc
• 0x1002 – ATI Technologies, Inc
• 0x10EC – RealTek, Incorporated
• 0x10DE – Nvidia Corporation
• 0x10B7 – 3Com Corporation
• 0x101C – Western Digital, Inc
• 0x1014 – IBM Corporation
• 0x0E11 – Compaq Corporation
• 0x1057 – Motorola Corporation
• 0x106B – Apple Computers, Inc
• 0x5333 – S3 Incorporated

Examples of DEVICE-IDs

• 0x5347: ATI RAGE128 SG
• 0x4C58: ATI RADEON LX
• 0x5950: ATI RS480
• 0x436E: ATI IXP300 SATA
• 0x438C: ATI IXP600 IDE
• 0x5B60: ATI Radeon X300

See this Linux header-file for lots more examples:

Defined PCI Class Codes

• 0x00: Legacy Device (i.e., built before class-codes were defined)
• 0x01: Mass Storage controller
• 0x02: Network controller
• 0x03: Display controller
• 0x04: Multimedia device
• 0x05: Memory Controller
• 0x06: Bridge device
• 0x07: Simple Communications controller
• 0x08: Base System peripherals
• 0x09: Input device
• 0x0A: Docking stations
• 0x0B: Processors
• 0x0C: Serial Bus controllers
• 0x0D: Wireless controllers
• 0x0E: Intelligent I/O controllers
• 0x0F: Encryption/Decryption controllers
• 0x10: Satellite Communications controllers
• 0x11: Data Acquisition and Signal Processing controllers

Example of Sub-Class Codes

• Class Code 0x01: Mass Storage controller
  – 0x00: SCSI controller
  – 0x01: IDE controller
  – 0x02: Floppy Disk controller
  – 0x03: IPI controller
  – 0x04: RAID controller
  – 0x80: Other Mass Storage controller

Example of Sub-Class Codes

• Class Code 0x02: Network controller
  – 0x00: Ethernet controller
  – 0x01: Token Ring controller
  – 0x02: FDDI controller
  – 0x03: ATM controller
  – 0x04: ISDN controller
  – 0x80: Other Network controller

Example of Sub-Class Codes

• Class Code 0x03: Display Controller
  – 0x00: VGA-compatible controller
  – 0x01: XGA controller
  – 0x02: 3D controller
  – 0x80: Other display controller

Hardware details may differ

• Graphics controllers use vendor-specific mechanisms to perform similar operations
• There’s a common core of compatibility with IBM’s VGA (Video Graphics Array) developed in the mid-1980s, but since IBM’s loss of market dominance, each manufacturer has added enhancements which employ incompatible programming interfaces, so you need a vendor’s manual!

The ‘frame-buffer’

• Today’s PCI graphics systems all provide a dedicated amount of display memory to control the screen-image’s pixel-coloring
• But the amount of memory varies with price
• And its location within the CPU’s physical address-space can’t be predicted because it depends upon what other PCI devices are installed (and mapped) during startup

The ‘base address’ fields

• The PCI Configuration Header has several so-called Base Address fields, and vendors use one of these to hold the frame-buffer’s starting address and to indicate how much vram the video controller can actually use
• The Linux kernel provides driver-writers with some convenient functions for getting the location and size of the frame-buffer

Radeon uses Base Address 0

• Our ‘vram.c’ module’s initialization routine employs these kernel helper-functions:

    #include <linux/pci.h>   // declares pci_get_device() and the pci_resource_*() helpers

    struct pci_dev *devp;    // will point to the kernel's structure for our device

    // get a pointer to the PCI device's Linux data-structure
    devp = pci_get_device( VENDOR_ID, DEVICE_ID, NULL );
    if ( !devp ) return -ENODEV;   // device is not present

    // get starting address and length for memory-resource 0
    vram_base = pci_resource_start( devp, 0 );
    vram_size = pci_resource_len( devp, 0 );

Reading from ‘vram’

• You can use our ‘fileview’ utility to see the current contents of the video frame-buffer: $ fileview /dev/vram
• Our ‘vram.c’ driver’s ‘read()’ method gets invoked when an application-program attempts to ‘read’ from the ‘/dev/vram’ device-file
• The read-method is implemented by our driver using ‘ioremap()’ (and ‘iounmap()’) to temporarily map a 4KB-page of physical vram to the kernel’s virtual address-space

I/O ‘memcpy()’ functions

• Linux provides a ‘platform-independent’ way to do copying from an i/o-device’s memory into an application’s buffer (or vice-versa):
  – A ‘read’ copies from vram to a user’s buffer: memcpy_fromio( buf, vaddr, len );
  – A ‘write’ copies to vram from a user’s buffer: memcpy_toio( vaddr, buf, len );
• Note: ‘buf’ here is the application’s buffer; drivers for current kernels normally stage the data in a kernel buffer and use ‘copy_to_user()’ or ‘copy_from_user()’ for the user-space side
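
Because the read-method maps only one 4KB-page at a time, a single ‘read()’ call can transfer at most the bytes that remain in the page containing the current file-position. The sketch below models that clamping in ordinary user-space C so it can be tried without the driver. It is an illustration, not the code from ‘vram.c’: the helper name ‘vram_chunk_length’ and the constant ‘VRAM_PAGE_BYTES’ are hypothetical, and it assumes the frame-buffer’s base address is page-aligned and that pages are 4 KB.

```c
#include <stddef.h>

/* Assumed page size: 4 KB, as on x86. */
#define VRAM_PAGE_BYTES 4096UL

/*
 * Hypothetical helper: how many bytes a single read() call may copy when
 * the driver maps only the one 4 KB page that contains offset 'pos'.
 * Assumes the frame-buffer's base address is page-aligned.
 */
size_t vram_chunk_length(unsigned long vram_size, unsigned long pos, size_t count)
{
    unsigned long left_in_page, left_in_vram;

    if (pos >= vram_size)
        return 0;                       /* at or past the end of vram */

    left_in_page = VRAM_PAGE_BYTES - (pos % VRAM_PAGE_BYTES);
    left_in_vram = vram_size - pos;

    if (count > left_in_page)
        count = left_in_page;           /* stay inside the mapped page */
    if (count > left_in_vram)
        count = left_in_vram;           /* stay inside the frame-buffer */
    return count;
}
```

With a 16-MB frame-buffer, a request for 200 bytes at offset 4000 yields 96 bytes, the remainder of the first page; the application simply calls ‘read()’ again to get the rest, which is the normal contract for short reads.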
‘mmap()’

• This is a standard UNIX system-call that lets an application ‘map’ a file into its virtual address-space, where it can then treat the file as if it were an ordinary array
• See the man-page: $ man mmap
• This same system-call can also work on a device-file if that device’s driver provides ‘mmap()’ among its file-operations

The user-role

• In the application-program, six arguments get passed to the ‘mmap()’ library-function, which returns the address of the mapped region:

    void *mmap( void *baseaddress, size_t memorysize, int accessattributes, int flags, int filehandle, off_t offset );

The driver-role

• In the kernel, those six arguments will get validated and processed, then the driver’s ‘mmap()’ callback-function will be invoked to supply missing information, perform further sanity-checks, and do appropriate page-mapping actions:

    int mmap( struct file *file, struct vm_area_struct *vma );

Our driver’s code

    // fb_base and fb_size hold the frame-buffer's physical starting address
    // and length, as obtained from the PCI device at initialization
    int mmap( struct file *file, struct vm_area_struct *vma )
    {
        // extract the parameters we will need from the 'vm_area_struct'
        unsigned long region_length = vma->vm_end - vma->vm_start;
        unsigned long region_origin = vma->vm_pgoff * PAGE_SIZE;
        unsigned long physical_addr = fb_base + region_origin;
        unsigned long user_virtaddr = vma->vm_start;

        // sanity check: mapped region cannot extend past end of vram
        if ( region_origin + region_length > fb_size ) return -EINVAL;

        // tell the kernel not to try 'swapping out' this region to the disk
        // (VM_RESERVED exists only in kernels older than 3.7)
        vma->vm_flags |= VM_RESERVED;

        // tell the kernel to exclude this region from any core dumps
        vma->vm_flags |= VM_IO;

Driver’s code continued

        // invoke a helper-function that will set up the page-table entries
        if ( remap_pfn_range( vma, user_virtaddr, physical_addr >> PAGE_SHIFT,
                              region_length, vma->vm_page_prot ) ) return -EAGAIN;

        return 0;   // SUCCESS
    }

Demo: ‘rotation.cpp’

• This application-program will demonstrate use of our ‘vram.c’ device-driver’s ‘read()’, ‘write()’ and ‘llseek()’ methods (i.e., device-file operations)
• It will perform a rotation of the color-components (R,G,B) in every displayed ‘truecolor’ pixel: R → G, G → B, B → R
• After running it 3 times the screen will look normal again

Demo: ‘inherit.cpp’

• This application-program will demonstrate use of the ‘mmap()’ method in our driver, and the fact that memory-mappings which a parent-process creates will be ‘inherited’ by a ‘child-process’
• You will see a rectangular purple border drawn on your display, provided the program-parameters match your screen

In-class exercise

• Can you adapt the ideas in ‘inherit.cpp’ to create a program (named ‘backward.cpp’) that will reverse the ordering of the pixels in each screen-row?
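
The pixel arithmetic behind these demos can be tried on an ordinary in-memory array before touching ‘/dev/vram’. The sketch below is not taken from ‘rotation.cpp’ or from any solution to the exercise; the function names ‘rotate_rgb’ and ‘reverse_row’ are hypothetical. It assumes the 32-bit ‘truecolor’ layout 0xAARRGGBB and reads the rotation as "the red value moves to green, green to blue, blue to red".

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed pixel layout (32-bit 'truecolor'): 0xAARRGGBB. */

/* Rotate the color components of one pixel: the red value moves to green,
 * green to blue, and blue to red; the alpha byte is left unchanged. */
uint32_t rotate_rgb(uint32_t pixel)
{
    uint32_t alpha = pixel & 0xFF000000u;
    uint32_t red   = (pixel >> 16) & 0xFFu;
    uint32_t green = (pixel >> 8) & 0xFFu;
    uint32_t blue  = pixel & 0xFFu;

    return alpha | (blue << 16) | (red << 8) | green;
}

/* Reverse the order of the pixels in one screen-row, in place. */
void reverse_row(uint32_t *row, size_t width)
{
    size_t left, right;

    if (width < 2)
        return;                         /* nothing to swap */

    for (left = 0, right = width - 1; left < right; left++, right--) {
        uint32_t tmp = row[left];
        row[left] = row[right];
        row[right] = tmp;
    }
}
```

Applying ‘rotate_rgb’ three times returns every pixel to its original value, which matches the remark that the screen looks normal again after three runs. A program working on the real display would apply the same functions to each row of the region obtained through ‘read()’ or ‘mmap()’, using a row length that matches the current screen-resolution.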