Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'misc-habanalabs-next-2021-01-27' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-next

Oded writes:

This tag contains habanalabs driver changes for v5.12:

- Add a feature called "staged command submissions". With this feature,
the driver allows the user to submit multiple command submissions
that together describe a single pass on the deep learning graph. The
driver tracks completion of the entire pass through the last stage CS.

- Update code to support the latest firmware image

- Optimizations and improvements to MMU code:
  - Support page sizes that are not a power of 2
  - Simplify the locking scheme
  - mmap areas of the device configuration space to userspace

- Security fixes:
- Make ETR non-secured
- Remove access to kernel memory through debug-fs interface
- Remove access through PCI bar to SyncManager register block
in Gaudi

- Many small bug fixes

* tag 'misc-habanalabs-next-2021-01-27' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (41 commits)
habanalabs: update to latest hl_boot_if.h spec from F/W
habanalabs/gaudi: unmask HBM interrupts after handling
habanalabs: update SyncManager interrupt handling
habanalabs: fix ETR security issue
habanalabs: staged submission support
habanalabs: modify device_idle interface
habanalabs: add CS completion and timeout properties
habanalabs: add new mem ioctl op for mapping hw blocks
habanalabs: fix MMU debugfs related nodes
habanalabs: add user available interrupt to hw_ip
habanalabs: always try to use the hint address
CREDITS: update email address and home address
habanalabs: update email address in sysfs/debugfs docs
habanalabs: add security violations dump to debugfs
habanalabs: ignore F/W BMC errors in case no BMC present
habanalabs/gaudi: print sync manager SEI interrupt info
habanalabs: Use 'dma_set_mask_and_coherent()'
habanalabs/gaudi: remove PCI access to SM block
habanalabs: add driver support for internal cb scheduling
habanalabs: increment ctx ref from within a cs allocation
...

+1825 -685
+4 -4
CREDITS
···
 S: Brazil

 N: Oded Gabbay
-E: oded.gabbay@gmail.com
-D: HabanaLabs and AMD KFD maintainer
-S: 12 Shraga Raphaeli
-S: Petah-Tikva, 4906418
+E: ogabbay@kernel.org
+D: HabanaLabs maintainer
+S: 29 Duchifat St.
+S: Ra'anana 4372029
 S: Israel

 N: Kumar Gala
+29 -21
Documentation/ABI/testing/debugfs-driver-habanalabs
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/addr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets the device address to be used for read or write through
                 PCI bar, or the device VA of a host mapped memory to be read or
                 written directly from the host. The latter option is allowed
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/clk_gate
 Date:           May 2020
 KernelVersion:  5.8
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allow the root user to disable/enable in runtime the clock
                 gating mechanism in Gaudi. Due to how Gaudi is built, the
                 clock gating needs to be disabled in order to access the
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/command_buffers
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays a list with information about the currently allocated
                 command buffers

 What:           /sys/kernel/debug/habanalabs/hl<n>/command_submission
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays a list with information about the currently active
                 command submissions

 What:           /sys/kernel/debug/habanalabs/hl<n>/command_submission_jobs
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays a list with detailed information about each JOB (CB) of
                 each active command submission

 What:           /sys/kernel/debug/habanalabs/hl<n>/data32
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the root user to read or write directly through the
                 device's PCI bar. Writing to this file generates a write
                 transaction while reading from the file generates a read
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/data64
 Date:           Jan 2020
 KernelVersion:  5.6
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the root user to read or write 64 bit data directly
                 through the device's PCI bar. Writing to this file generates a
                 write transaction while reading from the file generates a read
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/device
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Enables the root user to set the device to specific state.
                 Valid values are "disable", "enable", "suspend", "resume".
                 User can read this property to see the valid values
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/engines
 Date:           Jul 2019
 KernelVersion:  5.3
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the status registers values of the device engines and
                 their derived idle status

 What:           /sys/kernel/debug/habanalabs/hl<n>/i2c_addr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets I2C device address for I2C transaction that is generated
                 by the device's CPU

 What:           /sys/kernel/debug/habanalabs/hl<n>/i2c_bus
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets I2C bus address for I2C transaction that is generated by
                 the device's CPU

 What:           /sys/kernel/debug/habanalabs/hl<n>/i2c_data
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Triggers an I2C transaction that is generated by the device's
                 CPU. Writing to this file generates a write transaction while
                 reading from the file generates a read transcation
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/i2c_reg
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets I2C register id for I2C transaction that is generated by
                 the device's CPU

 What:           /sys/kernel/debug/habanalabs/hl<n>/led0
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets the state of the first S/W led on the device

 What:           /sys/kernel/debug/habanalabs/hl<n>/led1
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets the state of the second S/W led on the device

 What:           /sys/kernel/debug/habanalabs/hl<n>/led2
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets the state of the third S/W led on the device

 What:           /sys/kernel/debug/habanalabs/hl<n>/mmu
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the hop values and physical address for a given ASID
                 and virtual address. The user should write the ASID and VA into
                 the file and then read the file to get the result.
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/set_power_state
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets the PCI power state. Valid values are "1" for D0 and "2"
                 for D3Hot

 What:           /sys/kernel/debug/habanalabs/hl<n>/userptr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays a list with information about the currently user
                 pointers (user virtual addresses) that are pinned and mapped
                 to DMA addresses
···
 What:           /sys/kernel/debug/habanalabs/hl<n>/vm
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays a list with information about all the active virtual
                 address mappings per ASID

 What:           /sys/kernel/debug/habanalabs/hl<n>/stop_on_err
 Date:           Mar 2020
 KernelVersion:  5.6
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Sets the stop-on_error option for the device engines. Value of
                 "0" is for disable, otherwise enable.
+
+What:           /sys/kernel/debug/habanalabs/hl<n>/dump_security_violations
+Date:           Jan 2021
+KernelVersion:  5.12
+Contact:        ogabbay@kernel.org
+Description:    Dumps all security violations to dmesg. This will also ack
+                all security violations meanings those violations will not be
+                dumped next time user calls this API
+29 -29
Documentation/ABI/testing/sysfs-driver-habanalabs
···
 What:           /sys/class/habanalabs/hl<n>/armcp_kernel_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the Linux kernel running on the device's CPU.
                 Will be DEPRECATED in Linux kernel version 5.10, and be
                 replaced with cpucp_kernel_ver
···
 What:           /sys/class/habanalabs/hl<n>/armcp_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the application running on the device's CPU
                 Will be DEPRECATED in Linux kernel version 5.10, and be
                 replaced with cpucp_ver
···
 What:           /sys/class/habanalabs/hl<n>/clk_max_freq_mhz
 Date:           Jun 2019
 KernelVersion:  not yet upstreamed
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the user to set the maximum clock frequency, in MHz.
                 The device clock might be set to lower value than the maximum.
                 The user should read the clk_cur_freq_mhz to see the actual
···
 What:           /sys/class/habanalabs/hl<n>/clk_cur_freq_mhz
 Date:           Jun 2019
 KernelVersion:  not yet upstreamed
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the current frequency, in MHz, of the device clock.
                 This property is valid only for the Gaudi ASIC family

 What:           /sys/class/habanalabs/hl<n>/cpld_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the Device's CPLD F/W

 What:           /sys/class/habanalabs/hl<n>/cpucp_kernel_ver
 Date:           Oct 2020
 KernelVersion:  5.10
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the Linux kernel running on the device's CPU

 What:           /sys/class/habanalabs/hl<n>/cpucp_ver
 Date:           Oct 2020
 KernelVersion:  5.10
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the application running on the device's CPU

 What:           /sys/class/habanalabs/hl<n>/device_type
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the code name of the device according to its type.
                 The supported values are: "GOYA"

 What:           /sys/class/habanalabs/hl<n>/eeprom
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    A binary file attribute that contains the contents of the
                 on-board EEPROM

 What:           /sys/class/habanalabs/hl<n>/fuse_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the device's version from the eFuse

 What:           /sys/class/habanalabs/hl<n>/hard_reset
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Interface to trigger a hard-reset operation for the device.
                 Hard-reset will reset ALL internal components of the device
                 except for the PCI interface and the internal PLLs
···
 What:           /sys/class/habanalabs/hl<n>/hard_reset_cnt
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays how many times the device have undergone a hard-reset
                 operation since the driver was loaded

 What:           /sys/class/habanalabs/hl<n>/high_pll
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the user to set the maximum clock frequency for MME, TPC
                 and IC when the power management profile is set to "automatic".
                 This property is valid only for the Goya ASIC family
···
 What:           /sys/class/habanalabs/hl<n>/ic_clk
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the user to set the maximum clock frequency, in Hz, of
                 the Interconnect fabric. Writes to this parameter affect the
                 device only when the power management profile is set to "manual"
···
 What:           /sys/class/habanalabs/hl<n>/ic_clk_curr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the current clock frequency, in Hz, of the Interconnect
                 fabric. This property is valid only for the Goya ASIC family

 What:           /sys/class/habanalabs/hl<n>/infineon_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the Device's power supply F/W code

 What:           /sys/class/habanalabs/hl<n>/max_power
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the user to set the maximum power consumption of the
                 device in milliwatts.

 What:           /sys/class/habanalabs/hl<n>/mme_clk
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the user to set the maximum clock frequency, in Hz, of
                 the MME compute engine. Writes to this parameter affect the
                 device only when the power management profile is set to "manual"
···
 What:           /sys/class/habanalabs/hl<n>/mme_clk_curr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the current clock frequency, in Hz, of the MME compute
                 engine. This property is valid only for the Goya ASIC family

 What:           /sys/class/habanalabs/hl<n>/pci_addr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the PCI address of the device. This is needed so the
                 user would be able to open a device based on its PCI address

 What:           /sys/class/habanalabs/hl<n>/pm_mng_profile
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Power management profile. Values are "auto", "manual". In "auto"
                 mode, the driver will set the maximum clock frequency to a high
                 value when a user-space process opens the device's file (unless
···
 What:           /sys/class/habanalabs/hl<n>/preboot_btl_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the device's preboot F/W code

 What:           /sys/class/habanalabs/hl<n>/soft_reset
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Interface to trigger a soft-reset operation for the device.
                 Soft-reset will reset only the compute and DMA engines of the
                 device

 What:           /sys/class/habanalabs/hl<n>/soft_reset_cnt
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays how many times the device have undergone a soft-reset
                 operation since the driver was loaded

 What:           /sys/class/habanalabs/hl<n>/status
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Status of the card: "Operational", "Malfunction", "In reset".

 What:           /sys/class/habanalabs/hl<n>/thermal_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the Device's thermal daemon

 What:           /sys/class/habanalabs/hl<n>/tpc_clk
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Allows the user to set the maximum clock frequency, in Hz, of
                 the TPC compute engines. Writes to this parameter affect the
                 device only when the power management profile is set to "manual"
···
 What:           /sys/class/habanalabs/hl<n>/tpc_clk_curr
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Displays the current clock frequency, in Hz, of the TPC compute
                 engines. This property is valid only for the Goya ASIC family

 What:           /sys/class/habanalabs/hl<n>/uboot_ver
 Date:           Jan 2019
 KernelVersion:  5.1
-Contact:        oded.gabbay@gmail.com
+Contact:        ogabbay@kernel.org
 Description:    Version of the u-boot running on the device's CPU
+8 -2
drivers/misc/habanalabs/common/Makefile
···
 # SPDX-License-Identifier: GPL-2.0-only
+
+include $(src)/common/mmu/Makefile
+habanalabs-y += $(HL_COMMON_MMU_FILES)
+
+include $(src)/common/pci/Makefile
+habanalabs-y += $(HL_COMMON_PCI_FILES)
+
 HL_COMMON_FILES := common/habanalabs_drv.o common/device.o common/context.o \
                 common/asid.o common/habanalabs_ioctl.o \
                 common/command_buffer.o common/hw_queue.o common/irq.o \
                 common/sysfs.o common/hwmon.o common/memory.o \
-                common/command_submission.o common/mmu.o common/mmu_v1.o \
-                common/firmware_if.o common/pci.o
+                common/command_submission.o common/firmware_if.o
+4 -2
drivers/misc/habanalabs/common/asid.c
···

 void hl_asid_free(struct hl_device *hdev, unsigned long asid)
 {
-        if (WARN((asid == 0 || asid >= hdev->asic_prop.max_asid),
-                "Invalid ASID %lu", asid))
+        if (asid == HL_KERNEL_ASID_ID || asid >= hdev->asic_prop.max_asid) {
+                dev_crit(hdev->dev, "Invalid ASID %lu", asid);
                 return;
+        }
+
         clear_bit(asid, hdev->asid_bitmap);
 }
+5 -3
drivers/misc/habanalabs/common/command_buffer.c
···

         cb_handle >>= PAGE_SHIFT;
         cb = hl_cb_get(hdev, &hdev->kernel_cb_mgr, (u32) cb_handle);
-        /* hl_cb_get should never fail here so use kernel WARN */
-        WARN(!cb, "Kernel CB handle invalid 0x%x\n", (u32) cb_handle);
-        if (!cb)
+        /* hl_cb_get should never fail here */
+        if (!cb) {
+                dev_crit(hdev->dev, "Kernel CB handle invalid 0x%x\n",
+                        (u32) cb_handle);
                 goto destroy_cb;
+        }

         return cb;

+424 -51
drivers/misc/habanalabs/common/command_submission.c
···
         struct hl_device *hdev = hw_sob->hdev;

         dev_crit(hdev->dev,
-                "SOB release shouldn't be called here, q_idx: %d, sob_id: %d\n",
-                hw_sob->q_idx, hw_sob->sob_id);
+                        "SOB release shouldn't be called here, q_idx: %d, sob_id: %d\n",
+                        hw_sob->q_idx, hw_sob->sob_id);
 }

 /**
···
         kref_get(&fence->refcount);
 }

-static void hl_fence_init(struct hl_fence *fence)
+static void hl_fence_init(struct hl_fence *fence, u64 sequence)
 {
         kref_init(&fence->refcount);
+        fence->cs_sequence = sequence;
         fence->error = 0;
         fence->timestamp = ktime_set(0, 0);
         init_completion(&fence->completion);
···
 static void cs_job_put(struct hl_cs_job *job)
 {
         kref_put(&job->refcount, cs_job_do_release);
+}
+
+bool cs_needs_completion(struct hl_cs *cs)
+{
+        /* In case this is a staged CS, only the last CS in sequence should
+         * get a completion, any non staged CS will always get a completion
+         */
+        if (cs->staged_cs && !cs->staged_last)
+                return false;
+
+        return true;
+}
+
+bool cs_needs_timeout(struct hl_cs *cs)
+{
+        /* In case this is a staged CS, only the first CS in sequence should
+         * get a timeout, any non staged CS will always get a timeout
+         */
+        if (cs->staged_cs && !cs->staged_first)
+                return false;
+
+        return true;
 }

 static bool is_cb_patched(struct hl_device *hdev, struct hl_cs_job *job)
···
         parser.queue_type = job->queue_type;
         parser.is_kernel_allocated_cb = job->is_kernel_allocated_cb;
         job->patched_cb = NULL;
+        parser.completion = cs_needs_completion(job->cs);

         rc = hdev->asic_funcs->cs_parser(hdev, &parser);
···

         hl_debugfs_remove_job(hdev, job);

-        if (job->queue_type == QUEUE_TYPE_EXT ||
-                        job->queue_type == QUEUE_TYPE_HW)
+        /* We decrement reference only for a CS that gets completion
+         * because the reference was incremented only for this kind of CS
+         * right before it was scheduled.
+         *
+         * In staged submission, only the last CS marked as 'staged_last'
+         * gets completion, hence its release function will be called from here.
+         * As for all the rest CS's in the staged submission which do not get
+         * completion, their CS reference will be decremented by the
+         * 'staged_last' CS during the CS release flow.
+         * All relevant PQ CI counters will be incremented during the CS release
+         * flow by calling 'hl_hw_queue_update_ci'.
+         */
+        if (cs_needs_completion(cs) &&
+                (job->queue_type == QUEUE_TYPE_EXT ||
+                        job->queue_type == QUEUE_TYPE_HW))
                 cs_put(cs);

         cs_job_put(job);
+}
+
+/*
+ * hl_staged_cs_find_first - locate the first CS in this staged submission
+ *
+ * @hdev: pointer to device structure
+ * @cs_seq: staged submission sequence number
+ *
+ * @note: This function must be called under 'hdev->cs_mirror_lock'
+ *
+ * Find and return a CS pointer with the given sequence
+ */
+struct hl_cs *hl_staged_cs_find_first(struct hl_device *hdev, u64 cs_seq)
+{
+        struct hl_cs *cs;
+
+        list_for_each_entry_reverse(cs, &hdev->cs_mirror_list, mirror_node)
+                if (cs->staged_cs && cs->staged_first &&
+                                cs->sequence == cs_seq)
+                        return cs;
+
+        return NULL;
+}
+
+/*
+ * is_staged_cs_last_exists - returns true if the last CS in sequence exists
+ *
+ * @hdev: pointer to device structure
+ * @cs: staged submission member
+ *
+ */
+bool is_staged_cs_last_exists(struct hl_device *hdev, struct hl_cs *cs)
+{
+        struct hl_cs *last_entry;
+
+        last_entry = list_last_entry(&cs->staged_cs_node, struct hl_cs,
+                                                        staged_cs_node);
+
+        if (last_entry->staged_last)
+                return true;
+
+        return false;
+}
+
+/*
+ * staged_cs_get - get CS reference if this CS is a part of a staged CS
+ *
+ * @hdev: pointer to device structure
+ * @cs: current CS
+ * @cs_seq: staged submission sequence number
+ *
+ * Increment CS reference for every CS in this staged submission except for
+ * the CS which get completion.
+ */
+static void staged_cs_get(struct hl_device *hdev, struct hl_cs *cs)
+{
+        /* Only the last CS in this staged submission will get a completion.
+         * We must increment the reference for all other CS's in this
+         * staged submission.
+         * Once we get a completion we will release the whole staged submission.
+         */
+        if (!cs->staged_last)
+                cs_get(cs);
+}
+
+/*
+ * staged_cs_put - put a CS in case it is part of staged submission
+ *
+ * @hdev: pointer to device structure
+ * @cs: CS to put
+ *
+ * This function decrements a CS reference (for a non completion CS)
+ */
+static void staged_cs_put(struct hl_device *hdev, struct hl_cs *cs)
+{
+        /* We release all CS's in a staged submission except the last
+         * CS which we have never incremented its reference.
+         */
+        if (!cs_needs_completion(cs))
+                cs_put(cs);
+}
+
+static void cs_handle_tdr(struct hl_device *hdev, struct hl_cs *cs)
+{
+        bool next_entry_found = false;
+        struct hl_cs *next;
+
+        if (!cs_needs_timeout(cs))
+                return;
+
+        spin_lock(&hdev->cs_mirror_lock);
+
+        /* We need to handle tdr only once for the complete staged submission.
+         * Hence, we choose the CS that reaches this function first which is
+         * the CS marked as 'staged_last'.
+         */
+        if (cs->staged_cs && cs->staged_last)
+                cs = hl_staged_cs_find_first(hdev, cs->staged_sequence);
+
+        spin_unlock(&hdev->cs_mirror_lock);
+
+        /* Don't cancel TDR in case this CS was timedout because we might be
+         * running from the TDR context
+         */
+        if (cs && (cs->timedout ||
+                        hdev->timeout_jiffies == MAX_SCHEDULE_TIMEOUT))
+                return;
+
+        if (cs && cs->tdr_active)
+                cancel_delayed_work_sync(&cs->work_tdr);
+
+        spin_lock(&hdev->cs_mirror_lock);
+
+        /* queue TDR for next CS */
+        list_for_each_entry(next, &hdev->cs_mirror_list, mirror_node)
+                if (cs_needs_timeout(next)) {
+                        next_entry_found = true;
+                        break;
+                }
+
+        if (next_entry_found && !next->tdr_active) {
+                next->tdr_active = true;
+                schedule_delayed_work(&next->work_tdr,
+                                        hdev->timeout_jiffies);
+        }
+
+        spin_unlock(&hdev->cs_mirror_lock);
 }

 static void cs_do_release(struct kref *ref)
···

         hdev->asic_funcs->hw_queues_unlock(hdev);

-        /* Need to update CI for internal queues */
-        hl_int_hw_queue_update_ci(cs);
+        /* Need to update CI for all queue jobs that does not get completion */
+        hl_hw_queue_update_ci(cs);

         /* remove CS from CS mirror list */
         spin_lock(&hdev->cs_mirror_lock);
         list_del_init(&cs->mirror_node);
         spin_unlock(&hdev->cs_mirror_lock);

-        /* Don't cancel TDR in case this CS was timedout because we might be
-         * running from the TDR context
-         */
-        if (!cs->timedout && hdev->timeout_jiffies != MAX_SCHEDULE_TIMEOUT) {
-                struct hl_cs *next;
+        cs_handle_tdr(hdev, cs);

-                if (cs->tdr_active)
-                        cancel_delayed_work_sync(&cs->work_tdr);
+        if (cs->staged_cs) {
+                /* the completion CS decrements reference for the entire
+                 * staged submission
+                 */
+                if (cs->staged_last) {
+                        struct hl_cs *staged_cs, *tmp;

-                spin_lock(&hdev->cs_mirror_lock);
-
-                /* queue TDR for next CS */
-                next = list_first_entry_or_null(&hdev->cs_mirror_list,
-                                struct hl_cs, mirror_node);
-
-                if (next && !next->tdr_active) {
-                        next->tdr_active = true;
-                        schedule_delayed_work(&next->work_tdr,
-                                        hdev->timeout_jiffies);
+                        list_for_each_entry_safe(staged_cs, tmp,
+                                        &cs->staged_cs_node, staged_cs_node)
+                                staged_cs_put(hdev, staged_cs);
                 }

-                spin_unlock(&hdev->cs_mirror_lock);
+                /* A staged CS will be a member in the list only after it
+                 * was submitted. We used 'cs_mirror_lock' when inserting
+                 * it to list so we will use it again when removing it
+                 */
+                if (cs->submitted) {
+                        spin_lock(&hdev->cs_mirror_lock);
+                        list_del(&cs->staged_cs_node);
+                        spin_unlock(&hdev->cs_mirror_lock);
+                }
         }

 out:
···
 }

 static int allocate_cs(struct hl_device *hdev, struct hl_ctx *ctx,
-                enum hl_cs_type cs_type, struct hl_cs **cs_new)
+                enum hl_cs_type cs_type, u64 user_sequence,
+                struct hl_cs **cs_new)
 {
         struct hl_cs_counters_atomic *cntr;
         struct hl_fence *other = NULL;
···
                 atomic64_inc(&cntr->out_of_mem_drop_cnt);
                 return -ENOMEM;
         }
+
+        /* increment refcnt for context */
+        hl_ctx_get(hdev, ctx);

         cs->ctx = ctx;
         cs->submitted = false;
···
                 (hdev->asic_prop.max_pending_cs - 1)];

         if (other && !completion_done(&other->completion)) {
+                /* If the following statement is true, it means we have reached
+                 * a point in which only part of the staged submission was
+                 * submitted and we don't have enough room in the 'cs_pending'
+                 * array for the rest of the submission.
+                 * This causes a deadlock because this CS will never be
+                 * completed as it depends on future CS's for completion.
+                 */
+                if (other->cs_sequence == user_sequence)
+                        dev_crit_ratelimited(hdev->dev,
+                                "Staged CS %llu deadlock due to lack of resources",
+                                user_sequence);
+
                 dev_dbg_ratelimited(hdev->dev,
                         "Rejecting CS because of too many in-flights CS\n");
                 atomic64_inc(&ctx->cs_counters.max_cs_in_flight_drop_cnt);
···
         }

         /* init hl_fence */
-        hl_fence_init(&cs_cmpl->base_fence);
+        hl_fence_init(&cs_cmpl->base_fence, cs_cmpl->cs_seq);

         cs->sequence = cs_cmpl->cs_seq;
···
         kfree(cs_cmpl);
 free_cs:
         kfree(cs);
+        hl_ctx_put(ctx);
         return rc;
 }

 static void cs_rollback(struct hl_device *hdev, struct hl_cs *cs)
 {
         struct hl_cs_job *job, *tmp;
+
+        staged_cs_put(hdev, cs);

         list_for_each_entry_safe(job, tmp, &cs->job_list, cs_node)
                 complete_job(hdev, job);
···
         int i;
         struct hl_cs *cs, *tmp;

-        /* flush all completions */
+        /* flush all completions before iterating over the CS mirror list in
+         * order to avoid a race with the release functions
+         */
         for (i = 0 ; i < hdev->asic_prop.completion_queues_count ; i++)
                 flush_workqueue(hdev->cq_wq[i]);

···
                 cs_get(cs);
                 cs->aborted = true;
                 dev_warn_ratelimited(hdev->dev, "Killing CS %d.%llu\n",
-                                cs->ctx->asid, cs->sequence);
+                                        cs->ctx->asid, cs->sequence);
                 cs_rollback(hdev, cs);
                 cs_put(cs);
+        }
+}
+
+void hl_pending_cb_list_flush(struct hl_ctx *ctx)
+{
+        struct hl_pending_cb *pending_cb, *tmp;
+
+        list_for_each_entry_safe(pending_cb, tmp,
+                        &ctx->pending_cb_list, cb_node) {
+                list_del(&pending_cb->cb_node);
+                hl_cb_put(pending_cb->cb);
+                kfree(pending_cb);
         }
 }
···
                 return -EBUSY;
         }

+        if ((args->in.cs_flags & HL_CS_FLAGS_STAGED_SUBMISSION) &&
+                        !hdev->supports_staged_submission) {
+                dev_err(hdev->dev, "staged submission not supported");
+                return -EPERM;
+        }
+
         cs_type_flags = args->in.cs_flags & HL_CS_FLAGS_TYPE_MASK;

         if (unlikely(cs_type_flags && !is_power_of_2(cs_type_flags))) {
···
         return 0;
 }

-static int cs_ioctl_default(struct hl_fpriv *hpriv, void __user *chunks,
-                                u32 num_chunks, u64 *cs_seq, bool timestamp)
+static int cs_staged_submission(struct hl_device *hdev, struct hl_cs *cs,
+                                u64 sequence, u32 flags)
 {
-        bool int_queues_only = true;
+        if (!(flags & HL_CS_FLAGS_STAGED_SUBMISSION))
+                return 0;
+
+        cs->staged_last = !!(flags & HL_CS_FLAGS_STAGED_SUBMISSION_LAST);
+        cs->staged_first = !!(flags & HL_CS_FLAGS_STAGED_SUBMISSION_FIRST);
+
+        if (cs->staged_first) {
+                /* Staged CS sequence is the first CS sequence */
+                INIT_LIST_HEAD(&cs->staged_cs_node);
+                cs->staged_sequence = cs->sequence;
+        } else {
+                /* User sequence will be validated in 'hl_hw_queue_schedule_cs'
+                 * under the cs_mirror_lock
+                 */
+                cs->staged_sequence = sequence;
+        }
+
+        /* Increment CS reference if needed */
+        staged_cs_get(hdev, cs);
+
+        cs->staged_cs = true;
+
+        return 0;
+}
+
+static int cs_ioctl_default(struct hl_fpriv *hpriv, void __user *chunks,
+                                u32 num_chunks, u64 *cs_seq, u32 flags)
+{
+        bool staged_mid, int_queues_only = true;
         struct hl_device *hdev = hpriv->hdev;
         struct hl_cs_chunk *cs_chunk_array;
         struct hl_cs_counters_atomic *cntr;
···
         struct hl_cs_job *job;
         struct hl_cs *cs;
         struct hl_cb *cb;
+        u64 user_sequence;
         int rc, i;

         cntr = &hdev->aggregated_cs_counters;
+        user_sequence = *cs_seq;
         *cs_seq = ULLONG_MAX;

         rc = hl_cs_copy_chunk_array(hdev, &cs_chunk_array, chunks, num_chunks,
··· 1060 826 if (rc) 1061 827 goto out; 1062 828 1063 - /* increment refcnt for context */ 1064 - hl_ctx_get(hdev, hpriv->ctx); 829 + if ((flags & HL_CS_FLAGS_STAGED_SUBMISSION) && 830 + !(flags & HL_CS_FLAGS_STAGED_SUBMISSION_FIRST)) 831 + staged_mid = true; 832 + else 833 + staged_mid = false; 1065 834 1066 - rc = allocate_cs(hdev, hpriv->ctx, CS_TYPE_DEFAULT, &cs); 1067 - if (rc) { 1068 - hl_ctx_put(hpriv->ctx); 835 + rc = allocate_cs(hdev, hpriv->ctx, CS_TYPE_DEFAULT, 836 + staged_mid ? user_sequence : ULLONG_MAX, &cs); 837 + if (rc) 1069 838 goto free_cs_chunk_array; 1070 - } 1071 839 1072 - cs->timestamp = !!timestamp; 840 + cs->timestamp = !!(flags & HL_CS_FLAGS_TIMESTAMP); 1073 841 *cs_seq = cs->sequence; 1074 842 1075 843 hl_debugfs_add_cs(cs); 844 + 845 + rc = cs_staged_submission(hdev, cs, user_sequence, flags); 846 + if (rc) 847 + goto free_cs_object; 1076 848 1077 849 /* Validate ALL the CS chunks before submitting the CS */ 1078 850 for (i = 0 ; i < num_chunks ; i++) { ··· 1139 899 * Only increment for JOB on external or H/W queues, because 1140 900 * only for those JOBs we get completion 1141 901 */ 1142 - if (job->queue_type == QUEUE_TYPE_EXT || 1143 - job->queue_type == QUEUE_TYPE_HW) 902 + if (cs_needs_completion(cs) && 903 + (job->queue_type == QUEUE_TYPE_EXT || 904 + job->queue_type == QUEUE_TYPE_HW)) 1144 905 cs_get(cs); 1145 906 1146 907 hl_debugfs_add_job(hdev, job); ··· 1157 916 } 1158 917 } 1159 918 1160 - if (int_queues_only) { 919 + /* We allow a CS with any queue type combination as long as it does 920 + * not get a completion 921 + */ 922 + if (int_queues_only && cs_needs_completion(cs)) { 1161 923 atomic64_inc(&ctx->cs_counters.validation_drop_cnt); 1162 924 atomic64_inc(&cntr->validation_drop_cnt); 1163 925 dev_err(hdev->dev, 1164 - "Reject CS %d.%llu because only internal queues jobs are present\n", 926 + "Reject CS %d.%llu since it contains only internal queues jobs and needs completion\n", 1165 927 cs->ctx->asid, cs->sequence); 
1166 928 rc = -EINVAL; 1167 929 goto free_cs_object; ··· 1195 951 free_cs_chunk_array: 1196 952 kfree(cs_chunk_array); 1197 953 out: 954 + return rc; 955 + } 956 + 957 + static int pending_cb_create_job(struct hl_device *hdev, struct hl_ctx *ctx, 958 + struct hl_cs *cs, struct hl_cb *cb, u32 size, u32 hw_queue_id) 959 + { 960 + struct hw_queue_properties *hw_queue_prop; 961 + struct hl_cs_counters_atomic *cntr; 962 + struct hl_cs_job *job; 963 + 964 + hw_queue_prop = &hdev->asic_prop.hw_queues_props[hw_queue_id]; 965 + cntr = &hdev->aggregated_cs_counters; 966 + 967 + job = hl_cs_allocate_job(hdev, hw_queue_prop->type, true); 968 + if (!job) { 969 + atomic64_inc(&ctx->cs_counters.out_of_mem_drop_cnt); 970 + atomic64_inc(&cntr->out_of_mem_drop_cnt); 971 + dev_err(hdev->dev, "Failed to allocate a new job\n"); 972 + return -ENOMEM; 973 + } 974 + 975 + job->id = 0; 976 + job->cs = cs; 977 + job->user_cb = cb; 978 + atomic_inc(&job->user_cb->cs_cnt); 979 + job->user_cb_size = size; 980 + job->hw_queue_id = hw_queue_id; 981 + job->patched_cb = job->user_cb; 982 + job->job_cb_size = job->user_cb_size; 983 + 984 + /* increment refcount as for external queues we get completion */ 985 + cs_get(cs); 986 + 987 + cs->jobs_in_queue_cnt[job->hw_queue_id]++; 988 + 989 + list_add_tail(&job->cs_node, &cs->job_list); 990 + 991 + hl_debugfs_add_job(hdev, job); 992 + 993 + return 0; 994 + } 995 + 996 + static int hl_submit_pending_cb(struct hl_fpriv *hpriv) 997 + { 998 + struct hl_device *hdev = hpriv->hdev; 999 + struct hl_ctx *ctx = hpriv->ctx; 1000 + struct hl_pending_cb *pending_cb, *tmp; 1001 + struct list_head local_cb_list; 1002 + struct hl_cs *cs; 1003 + struct hl_cb *cb; 1004 + u32 hw_queue_id; 1005 + u32 cb_size; 1006 + int process_list, rc = 0; 1007 + 1008 + if (list_empty(&ctx->pending_cb_list)) 1009 + return 0; 1010 + 1011 + process_list = atomic_cmpxchg(&ctx->thread_pending_cb_token, 1, 0); 1012 + 1013 + /* Only a single thread is allowed to process the list */ 1014 + if 
(!process_list) 1015 + return 0; 1016 + 1017 + if (list_empty(&ctx->pending_cb_list)) 1018 + goto free_pending_cb_token; 1019 + 1020 + /* move all list elements to a local list */ 1021 + INIT_LIST_HEAD(&local_cb_list); 1022 + spin_lock(&ctx->pending_cb_lock); 1023 + list_for_each_entry_safe(pending_cb, tmp, &ctx->pending_cb_list, 1024 + cb_node) 1025 + list_move_tail(&pending_cb->cb_node, &local_cb_list); 1026 + spin_unlock(&ctx->pending_cb_lock); 1027 + 1028 + rc = allocate_cs(hdev, ctx, CS_TYPE_DEFAULT, ULLONG_MAX, &cs); 1029 + if (rc) 1030 + goto add_list_elements; 1031 + 1032 + hl_debugfs_add_cs(cs); 1033 + 1034 + /* Iterate through pending cb list, create jobs and add to CS */ 1035 + list_for_each_entry(pending_cb, &local_cb_list, cb_node) { 1036 + cb = pending_cb->cb; 1037 + cb_size = pending_cb->cb_size; 1038 + hw_queue_id = pending_cb->hw_queue_id; 1039 + 1040 + rc = pending_cb_create_job(hdev, ctx, cs, cb, cb_size, 1041 + hw_queue_id); 1042 + if (rc) 1043 + goto free_cs_object; 1044 + } 1045 + 1046 + rc = hl_hw_queue_schedule_cs(cs); 1047 + if (rc) { 1048 + if (rc != -EAGAIN) 1049 + dev_err(hdev->dev, 1050 + "Failed to submit CS %d.%llu (%d)\n", 1051 + ctx->asid, cs->sequence, rc); 1052 + goto free_cs_object; 1053 + } 1054 + 1055 + /* pending cb was scheduled successfully */ 1056 + list_for_each_entry_safe(pending_cb, tmp, &local_cb_list, cb_node) { 1057 + list_del(&pending_cb->cb_node); 1058 + kfree(pending_cb); 1059 + } 1060 + 1061 + cs_put(cs); 1062 + 1063 + goto free_pending_cb_token; 1064 + 1065 + free_cs_object: 1066 + cs_rollback(hdev, cs); 1067 + cs_put(cs); 1068 + add_list_elements: 1069 + spin_lock(&ctx->pending_cb_lock); 1070 + list_for_each_entry_safe_reverse(pending_cb, tmp, &local_cb_list, 1071 + cb_node) 1072 + list_move(&pending_cb->cb_node, &ctx->pending_cb_list); 1073 + spin_unlock(&ctx->pending_cb_lock); 1074 + free_pending_cb_token: 1075 + atomic_set(&ctx->thread_pending_cb_token, 1); 1076 + 1198 1077 return rc; 1199 1078 } 1200 1079 
··· 1370 1003 rc = 0; 1371 1004 } else { 1372 1005 rc = cs_ioctl_default(hpriv, chunks, num_chunks, 1373 - cs_seq, false); 1006 + cs_seq, 0); 1374 1007 } 1375 1008 1376 1009 mutex_unlock(&hpriv->restore_phase_mutex); ··· 1642 1275 } 1643 1276 } 1644 1277 1645 - /* increment refcnt for context */ 1646 - hl_ctx_get(hdev, ctx); 1647 - 1648 - rc = allocate_cs(hdev, ctx, cs_type, &cs); 1278 + rc = allocate_cs(hdev, ctx, cs_type, ULLONG_MAX, &cs); 1649 1279 if (rc) { 1650 1280 if (cs_type == CS_TYPE_WAIT || 1651 1281 cs_type == CS_TYPE_COLLECTIVE_WAIT) 1652 1282 hl_fence_put(sig_fence); 1653 - hl_ctx_put(ctx); 1654 1283 goto free_cs_chunk_array; 1655 1284 } 1656 1285 ··· 1709 1346 enum hl_cs_type cs_type; 1710 1347 u64 cs_seq = ULONG_MAX; 1711 1348 void __user *chunks; 1712 - u32 num_chunks; 1349 + u32 num_chunks, flags; 1713 1350 int rc; 1714 1351 1715 1352 rc = hl_cs_sanity_checks(hpriv, args); ··· 1720 1357 if (rc) 1721 1358 goto out; 1722 1359 1360 + rc = hl_submit_pending_cb(hpriv); 1361 + if (rc) 1362 + goto out; 1363 + 1723 1364 cs_type = hl_cs_get_cs_type(args->in.cs_flags & 1724 1365 ~HL_CS_FLAGS_FORCE_RESTORE); 1725 1366 chunks = (void __user *) (uintptr_t) args->in.chunks_execute; 1726 1367 num_chunks = args->in.num_chunks_execute; 1368 + flags = args->in.cs_flags; 1369 + 1370 + /* In case this is a staged CS, user should supply the CS sequence */ 1371 + if ((flags & HL_CS_FLAGS_STAGED_SUBMISSION) && 1372 + !(flags & HL_CS_FLAGS_STAGED_SUBMISSION_FIRST)) 1373 + cs_seq = args->in.seq; 1727 1374 1728 1375 switch (cs_type) { 1729 1376 case CS_TYPE_SIGNAL: ··· 1744 1371 break; 1745 1372 default: 1746 1373 rc = cs_ioctl_default(hpriv, chunks, num_chunks, &cs_seq, 1747 - args->in.cs_flags & HL_CS_FLAGS_TIMESTAMP); 1374 + args->in.cs_flags); 1748 1375 break; 1749 1376 } 1750 1377
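The `hl_submit_pending_cb()` path above uses a lock-free "single consumer" token: `atomic_cmpxchg(&ctx->thread_pending_cb_token, 1, 0)` lets exactly one thread drain the pending-CB list while any concurrent caller simply returns, and the winner restores the token when it is done. A minimal userspace sketch of the same idea using C11 atomics in place of the kernel's `atomic_cmpxchg()` (all names here are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdatomic.h>

/* Token starts at 1 (free). The first caller that swaps 1 -> 0 becomes
 * the single consumer; every later caller sees 0 and backs off. */
static atomic_int pending_token = 1;

/* Returns 1 when the caller won the token, 0 otherwise. The winner must
 * call release_pending_token() when finished, mirroring the driver's
 * free_pending_cb_token label. */
static int try_acquire_pending_token(void)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&pending_token, &expected, 0);
}

static void release_pending_token(void)
{
	atomic_store(&pending_token, 1);
}
```

The point of the pattern is that the spinlock is only held for the cheap list splice, never across `allocate_cs()` or queue scheduling.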
drivers/misc/habanalabs/common/context.c  +26 -7
···
 static void hl_ctx_fini(struct hl_ctx *ctx)
 {
 	struct hl_device *hdev = ctx->hdev;
-	u64 idle_mask = 0;
+	u64 idle_mask[HL_BUSY_ENGINES_MASK_EXT_SIZE] = {0};
 	int i;
+
+	/* Release all allocated pending cb's, those cb's were never
+	 * scheduled so it is safe to release them here
+	 */
+	hl_pending_cb_list_flush(ctx);

 	/*
 	 * If we arrived here, there are no jobs waiting for this context
···

 		if ((!hdev->pldm) && (hdev->pdev) &&
 			(!hdev->asic_funcs->is_device_idle(hdev,
-				&idle_mask, NULL)))
+				idle_mask,
+				HL_BUSY_ENGINES_MASK_EXT_SIZE, NULL)))
 			dev_notice(hdev->dev,
-				"device not idle after user context is closed (0x%llx)\n",
-				idle_mask);
+				"device not idle after user context is closed (0x%llx, 0x%llx)\n",
+				idle_mask[0], idle_mask[1]);
 	} else {
 		dev_dbg(hdev->dev, "closing kernel context\n");
+		hdev->asic_funcs->ctx_fini(ctx);
+		hl_vm_ctx_fini(ctx);
 		hl_mmu_ctx_fini(ctx);
 	}
 }
···
 	kref_init(&ctx->refcount);

 	ctx->cs_sequence = 1;
+	INIT_LIST_HEAD(&ctx->pending_cb_list);
+	spin_lock_init(&ctx->pending_cb_lock);
 	spin_lock_init(&ctx->cs_lock);
 	atomic_set(&ctx->thread_ctx_switch_token, 1);
+	atomic_set(&ctx->thread_pending_cb_token, 1);
 	ctx->thread_ctx_switch_wait_token = 0;
 	ctx->cs_pending = kcalloc(hdev->asic_prop.max_pending_cs,
 				sizeof(struct hl_fence *),
···

 	if (is_kernel_ctx) {
 		ctx->asid = HL_KERNEL_ASID_ID; /* Kernel driver gets ASID 0 */
-		rc = hl_mmu_ctx_init(ctx);
+		rc = hl_vm_ctx_init(ctx);
 		if (rc) {
-			dev_err(hdev->dev, "Failed to init mmu ctx module\n");
+			dev_err(hdev->dev, "Failed to init mem ctx module\n");
+			rc = -ENOMEM;
 			goto err_free_cs_pending;
+		}
+
+		rc = hdev->asic_funcs->ctx_init(ctx);
+		if (rc) {
+			dev_err(hdev->dev, "ctx_init failed\n");
+			goto err_vm_ctx_fini;
 		}
 	} else {
 		ctx->asid = hl_asid_alloc(hdev);
···
 err_vm_ctx_fini:
 	hl_vm_ctx_fini(ctx);
 err_asid_free:
-	hl_asid_free(hdev, ctx->asid);
+	if (ctx->asid != HL_KERNEL_ASID_ID)
+		hl_asid_free(hdev, ctx->asid);
 err_free_cs_pending:
 	kfree(ctx->cs_pending);
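The reworked `hl_ctx_init()` keeps the kernel's label-per-resource unwind style: a failure in the new `ctx_init` step jumps to `err_vm_ctx_fini`, so only the steps that already succeeded are undone, and the `err_asid_free` path now guards against freeing the kernel ASID. A compilable toy of that shape (the `fake_init`/`fake_fini` helpers and step numbers are invented purely for illustration):

```c
#include <assert.h>

static int allocs, frees;

static int fake_init(int fail) { if (fail) return -1; allocs++; return 0; }
static void fake_fini(void) { frees++; }

/* Hypothetical two-step init: each failure path unwinds exactly the
 * steps that already succeeded, mirroring the err_* labels in
 * hl_ctx_init(). fail_step selects which step (if any) fails. */
static int ctx_init_sketch(int fail_step)
{
	if (fake_init(fail_step == 1))
		return -1;		/* nothing to unwind yet */
	if (fake_init(fail_step == 2))
		goto err_first_fini;	/* undo only the first step */
	return 0;

err_first_fini:
	fake_fini();
	return -1;
}
```

The discipline is that the goto target names the last resource that must be released, so adding a step means adding exactly one label.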
drivers/misc/habanalabs/common/debugfs.c  +38 -5
···
 	struct hl_dbg_device_entry *dev_entry = entry->dev_entry;
 	struct hl_device *hdev = dev_entry->hdev;
 	struct hl_ctx *ctx;
-	struct hl_mmu_hop_info hops_info;
-	u64 virt_addr = dev_entry->mmu_addr;
+	struct hl_mmu_hop_info hops_info = {0};
+	u64 virt_addr = dev_entry->mmu_addr, phys_addr;
 	int i;

 	if (!hdev->mmu_enable)
···
 		return 0;
 	}

-	seq_printf(s, "asid: %u, virt_addr: 0x%llx\n",
-			dev_entry->mmu_asid, dev_entry->mmu_addr);
+	phys_addr = hops_info.hop_info[hops_info.used_hops - 1].hop_pte_val;
+
+	if (hops_info.scrambled_vaddr &&
+		(dev_entry->mmu_addr != hops_info.scrambled_vaddr))
+		seq_printf(s,
+			"asid: %u, virt_addr: 0x%llx, scrambled virt_addr: 0x%llx,\nphys_addr: 0x%llx, scrambled_phys_addr: 0x%llx\n",
+			dev_entry->mmu_asid, dev_entry->mmu_addr,
+			hops_info.scrambled_vaddr,
+			hops_info.unscrambled_paddr, phys_addr);
+	else
+		seq_printf(s,
+			"asid: %u, virt_addr: 0x%llx, phys_addr: 0x%llx\n",
+			dev_entry->mmu_asid, dev_entry->mmu_addr, phys_addr);

 	for (i = 0 ; i < hops_info.used_hops ; i++) {
 		seq_printf(s, "hop%d_addr: 0x%llx\n",
···
 		return 0;
 	}

-	hdev->asic_funcs->is_device_idle(hdev, NULL, s);
+	hdev->asic_funcs->is_device_idle(hdev, NULL, 0, s);

 	return 0;
 }
···
 	return count;
 }

+static ssize_t hl_security_violations_read(struct file *f, char __user *buf,
+					size_t count, loff_t *ppos)
+{
+	struct hl_dbg_device_entry *entry = file_inode(f)->i_private;
+	struct hl_device *hdev = entry->hdev;
+
+	hdev->asic_funcs->ack_protection_bits_errors(hdev);
+
+	return 0;
+}
+
 static const struct file_operations hl_data32b_fops = {
 	.owner = THIS_MODULE,
 	.read = hl_data_read32,
···
 	.owner = THIS_MODULE,
 	.read = hl_stop_on_err_read,
 	.write = hl_stop_on_err_write
+};
+
+static const struct file_operations hl_security_violations_fops = {
+	.owner = THIS_MODULE,
+	.read = hl_security_violations_read
 };

 static const struct hl_info_list hl_debugfs_list[] = {
···
 				dev_entry->root,
 				dev_entry,
 				&hl_stop_on_err_fops);
+
+	debugfs_create_file("dump_security_violations",
+				0644,
+				dev_entry->root,
+				dev_entry,
+				&hl_security_violations_fops);

 	for (i = 0, entry = dev_entry->entry_arr ; i < count ; i++, entry++) {
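The updated MMU debugfs dump derives the reported physical address from the last used hop's PTE value, and prints the scrambled variants only when address scrambling actually changed the virtual address. A small sketch of those two decisions (the struct layout is simplified and the field names only loosely follow `hl_mmu_hop_info`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_HOPS 5	/* stands in for MMU_ARCH_5_HOPS */

/* Simplified shape of a hop-walk result. */
struct hop_walk_sketch {
	uint64_t hop_pte_val[MAX_HOPS];
	unsigned int used_hops;
	uint64_t scrambled_vaddr;
};

/* The dump reports the final hop's PTE value as the physical address. */
static uint64_t walk_phys_addr(const struct hop_walk_sketch *w)
{
	return w->hop_pte_val[w->used_hops - 1];
}

/* Extended (scrambled) output is only worthwhile when scrambling is in
 * use and it really altered the virtual address. */
static bool wants_scrambled_dump(const struct hop_walk_sketch *w,
				 uint64_t vaddr)
{
	return w->scrambled_vaddr && w->scrambled_vaddr != vaddr;
}
```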
drivers/misc/habanalabs/common/device.c  +15 -8
···
 	switch (vm_pgoff & HL_MMAP_TYPE_MASK) {
 	case HL_MMAP_TYPE_CB:
 		return hl_cb_mmap(hpriv, vma);
+
+	case HL_MMAP_TYPE_BLOCK:
+		return hl_hw_block_mmap(hpriv, vma);
 	}

 	return -EINVAL;
···

 	mutex_init(&hdev->send_cpu_message_lock);
 	mutex_init(&hdev->debug_lock);
-	mutex_init(&hdev->mmu_cache_lock);
 	INIT_LIST_HEAD(&hdev->cs_mirror_list);
 	spin_lock_init(&hdev->cs_mirror_lock);
 	INIT_LIST_HEAD(&hdev->fpriv_list);
···
 {
 	int i;

-	mutex_destroy(&hdev->mmu_cache_lock);
 	mutex_destroy(&hdev->debug_lock);
 	mutex_destroy(&hdev->send_cpu_message_lock);

···

 	hdev->compute_ctx = NULL;

+	hl_debugfs_add_device(hdev);
+
+	/* debugfs nodes are created in hl_ctx_init so it must be called after
+	 * hl_debugfs_add_device.
+	 */
 	rc = hl_ctx_init(hdev, hdev->kernel_ctx, true);
 	if (rc) {
 		dev_err(hdev->dev, "failed to initialize kernel context\n");
 		kfree(hdev->kernel_ctx);
-		goto mmu_fini;
+		goto remove_device_from_debugfs;
 	}

 	rc = hl_cb_pool_init(hdev);
···
 		dev_err(hdev->dev, "failed to initialize CB pool\n");
 		goto release_ctx;
 	}
-
-	hl_debugfs_add_device(hdev);

 	/*
 	 * From this point, in case of an error, add char devices and create
···
 	if (hl_ctx_put(hdev->kernel_ctx) != 1)
 		dev_err(hdev->dev,
 			"kernel ctx is still alive on initialization failure\n");
+remove_device_from_debugfs:
+	hl_debugfs_remove_device(hdev);
 mmu_fini:
 	hl_mmu_fini(hdev);
 eq_fini:
···
 		usleep_range(50, 200);
 		rc = atomic_cmpxchg(&hdev->in_reset, 0, 1);
 		if (ktime_compare(ktime_get(), timeout) > 0) {
-			WARN(1, "Failed to remove device because reset function did not finish\n");
+			dev_crit(hdev->dev,
+				"Failed to remove device because reset function did not finish\n");
 			return;
 		}
 	}
···

 	device_late_fini(hdev);

-	hl_debugfs_remove_device(hdev);
-
 	/*
 	 * Halt the engines and disable interrupts so we won't get any more
 	 * completions from H/W and we won't have any accesses from the
···
 	/* Release kernel context */
 	if ((hdev->kernel_ctx) && (hl_ctx_put(hdev->kernel_ctx) != 1))
 		dev_err(hdev->dev, "kernel ctx is still alive\n");
+
+	hl_debugfs_remove_device(hdev);

 	hl_vm_fini(hdev);
drivers/misc/habanalabs/common/firmware_if.c  +90 -53
···
 	return rc;
 }

+static int fw_read_errors(struct hl_device *hdev, u32 boot_err0_reg,
+				u32 cpu_security_boot_status_reg)
+{
+	u32 err_val, security_val;
+
+	/* Some of the firmware status codes are deprecated in newer f/w
+	 * versions. In those versions, the errors are reported
+	 * in different registers. Therefore, we need to check those
+	 * registers and print the exact errors. Moreover, there
+	 * may be multiple errors, so we need to report on each error
+	 * separately. Some of the error codes might indicate a state
+	 * that is not an error per-se, but it is an error in production
+	 * environment
+	 */
+	err_val = RREG32(boot_err0_reg);
+	if (!(err_val & CPU_BOOT_ERR0_ENABLED))
+		return 0;
+
+	if (err_val & CPU_BOOT_ERR0_DRAM_INIT_FAIL)
+		dev_err(hdev->dev,
+			"Device boot error - DRAM initialization failed\n");
+	if (err_val & CPU_BOOT_ERR0_FIT_CORRUPTED)
+		dev_err(hdev->dev, "Device boot error - FIT image corrupted\n");
+	if (err_val & CPU_BOOT_ERR0_TS_INIT_FAIL)
+		dev_err(hdev->dev,
+			"Device boot error - Thermal Sensor initialization failed\n");
+	if (err_val & CPU_BOOT_ERR0_DRAM_SKIPPED)
+		dev_warn(hdev->dev,
+			"Device boot warning - Skipped DRAM initialization\n");
+
+	if (err_val & CPU_BOOT_ERR0_BMC_WAIT_SKIPPED) {
+		if (hdev->bmc_enable)
+			dev_warn(hdev->dev,
+				"Device boot error - Skipped waiting for BMC\n");
+		else
+			err_val &= ~CPU_BOOT_ERR0_BMC_WAIT_SKIPPED;
+	}
+
+	if (err_val & CPU_BOOT_ERR0_NIC_DATA_NOT_RDY)
+		dev_err(hdev->dev,
+			"Device boot error - Serdes data from BMC not available\n");
+	if (err_val & CPU_BOOT_ERR0_NIC_FW_FAIL)
+		dev_err(hdev->dev,
+			"Device boot error - NIC F/W initialization failed\n");
+	if (err_val & CPU_BOOT_ERR0_SECURITY_NOT_RDY)
+		dev_warn(hdev->dev,
+			"Device boot warning - security not ready\n");
+	if (err_val & CPU_BOOT_ERR0_SECURITY_FAIL)
+		dev_err(hdev->dev, "Device boot error - security failure\n");
+	if (err_val & CPU_BOOT_ERR0_EFUSE_FAIL)
+		dev_err(hdev->dev, "Device boot error - eFuse failure\n");
+	if (err_val & CPU_BOOT_ERR0_PLL_FAIL)
+		dev_err(hdev->dev, "Device boot error - PLL failure\n");
+
+	security_val = RREG32(cpu_security_boot_status_reg);
+	if (security_val & CPU_BOOT_DEV_STS0_ENABLED)
+		dev_dbg(hdev->dev, "Device security status %#x\n",
+				security_val);
+
+	if (err_val & ~CPU_BOOT_ERR0_ENABLED)
+		return -EIO;
+
+	return 0;
+}
+
 int hl_fw_cpucp_info_get(struct hl_device *hdev,
-			u32 cpu_security_boot_status_reg)
+			u32 cpu_security_boot_status_reg,
+			u32 boot_err0_reg)
 {
 	struct asic_fixed_properties *prop = &hdev->asic_prop;
 	struct cpucp_packet pkt = {};
···
 	if (rc) {
 		dev_err(hdev->dev,
 			"Failed to handle CPU-CP info pkt, error %d\n", rc);
+		goto out;
+	}
+
+	rc = fw_read_errors(hdev, boot_err0_reg, cpu_security_boot_status_reg);
+	if (rc) {
+		dev_err(hdev->dev, "Errors in device boot\n");
 		goto out;
 	}

···
 	return rc;
 }

-static void fw_read_errors(struct hl_device *hdev, u32 boot_err0_reg,
-			u32 cpu_security_boot_status_reg)
-{
-	u32 err_val, security_val;
-
-	/* Some of the firmware status codes are deprecated in newer f/w
-	 * versions. In those versions, the errors are reported
-	 * in different registers. Therefore, we need to check those
-	 * registers and print the exact errors. Moreover, there
-	 * may be multiple errors, so we need to report on each error
-	 * separately. Some of the error codes might indicate a state
-	 * that is not an error per-se, but it is an error in production
-	 * environment
-	 */
-	err_val = RREG32(boot_err0_reg);
-	if (!(err_val & CPU_BOOT_ERR0_ENABLED))
-		return;
-
-	if (err_val & CPU_BOOT_ERR0_DRAM_INIT_FAIL)
-		dev_err(hdev->dev,
-			"Device boot error - DRAM initialization failed\n");
-	if (err_val & CPU_BOOT_ERR0_FIT_CORRUPTED)
-		dev_err(hdev->dev, "Device boot error - FIT image corrupted\n");
-	if (err_val & CPU_BOOT_ERR0_TS_INIT_FAIL)
-		dev_err(hdev->dev,
-			"Device boot error - Thermal Sensor initialization failed\n");
-	if (err_val & CPU_BOOT_ERR0_DRAM_SKIPPED)
-		dev_warn(hdev->dev,
-			"Device boot warning - Skipped DRAM initialization\n");
-	if (err_val & CPU_BOOT_ERR0_BMC_WAIT_SKIPPED)
-		dev_warn(hdev->dev,
-			"Device boot error - Skipped waiting for BMC\n");
-	if (err_val & CPU_BOOT_ERR0_NIC_DATA_NOT_RDY)
-		dev_err(hdev->dev,
-			"Device boot error - Serdes data from BMC not available\n");
-	if (err_val & CPU_BOOT_ERR0_NIC_FW_FAIL)
-		dev_err(hdev->dev,
-			"Device boot error - NIC F/W initialization failed\n");
-	if (err_val & CPU_BOOT_ERR0_SECURITY_NOT_RDY)
-		dev_warn(hdev->dev,
-			"Device boot warning - security not ready\n");
-	if (err_val & CPU_BOOT_ERR0_SECURITY_FAIL)
-		dev_err(hdev->dev, "Device boot error - security failure\n");
-	if (err_val & CPU_BOOT_ERR0_EFUSE_FAIL)
-		dev_err(hdev->dev, "Device boot error - eFuse failure\n");
-
-	security_val = RREG32(cpu_security_boot_status_reg);
-	if (security_val & CPU_BOOT_DEV_STS0_ENABLED)
-		dev_dbg(hdev->dev, "Device security status %#x\n",
-				security_val);
-}
-
 static void detect_cpu_boot_status(struct hl_device *hdev, u32 status)
 {
 	/* Some of the status codes below are deprecated in newer f/w
···
 		prop->fw_security_disabled = true;
 	}

+	dev_dbg(hdev->dev, "Firmware preboot security status %#x\n",
+			security_status);
+
 	dev_dbg(hdev->dev, "Firmware preboot hard-reset is %s\n",
 		prop->hard_reset_done_by_fw ? "enabled" : "disabled");

···
 		if (prop->fw_boot_cpu_security_map &
 				CPU_BOOT_DEV_STS0_FW_HARD_RST_EN)
 			prop->hard_reset_done_by_fw = true;
+
+		dev_dbg(hdev->dev,
+			"Firmware boot CPU security status %#x\n",
+			prop->fw_boot_cpu_security_map);
 	}

 	dev_dbg(hdev->dev, "Firmware boot CPU hard-reset is %s\n",
···
 		goto out;
 	}

+	rc = fw_read_errors(hdev, boot_err0_reg, cpu_security_boot_status_reg);
+	if (rc)
+		return rc;
+
 	/* Clear reset status since we need to read again from app */
 	prop->hard_reset_done_by_fw = false;

···
 		if (prop->fw_app_security_map &
 				CPU_BOOT_DEV_STS0_FW_HARD_RST_EN)
 			prop->hard_reset_done_by_fw = true;
+
+		dev_dbg(hdev->dev,
+			"Firmware application CPU security status %#x\n",
+			prop->fw_app_security_map);
 	}

 	dev_dbg(hdev->dev, "Firmware application CPU hard-reset is %s\n",
 		prop->hard_reset_done_by_fw ? "enabled" : "disabled");

 	dev_info(hdev->dev, "Successfully loaded firmware to device\n");
+
+	return 0;

 out:
 	fw_read_errors(hdev, boot_err0_reg, cpu_security_boot_status_reg);
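The rewritten `fw_read_errors()` above turns boot-error reporting into a pass/fail check: the BMC-wait bit is masked off when no BMC is present, and any bit left besides the ENABLED marker makes the function return `-EIO`. A userspace sketch of that decode logic (the bit values below are illustrative stand-ins; the real `CPU_BOOT_ERR0_*` constants live in `hl_boot_if.h` and differ):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions, not the real register layout. */
#define ERR0_DRAM_INIT_FAIL	(1u << 0)
#define ERR0_BMC_WAIT_SKIPPED	(1u << 1)
#define ERR0_ENABLED		(1u << 31)

/* Mirrors the shape of fw_read_errors(): the BMC-skip bit only counts
 * as an error when a BMC is present, and any remaining bit besides
 * ENABLED turns the whole check into -EIO (-5 here). */
static int decode_boot_errors(uint32_t err_val, bool bmc_enable)
{
	if (!(err_val & ERR0_ENABLED))
		return 0;	/* register not populated by F/W */

	if ((err_val & ERR0_BMC_WAIT_SKIPPED) && !bmc_enable)
		err_val &= ~ERR0_BMC_WAIT_SKIPPED;

	if (err_val & ~ERR0_ENABLED)
		return -5;	/* -EIO in the driver */

	return 0;
}
```

This is why the helper moved above its first caller in the file: `hl_fw_cpucp_info_get()` now consumes its return value instead of treating it as pure logging.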
drivers/misc/habanalabs/common/habanalabs.h  +89 -11
··· 28 28 #define HL_NAME "habanalabs" 29 29 30 30 /* Use upper bits of mmap offset to store habana driver specific information. 31 - * bits[63:62] - Encode mmap type 31 + * bits[63:61] - Encode mmap type 32 32 * bits[45:0] - mmap offset value 33 33 * 34 34 * NOTE: struct vm_area_struct.vm_pgoff uses offset in pages. Hence, these 35 35 * defines are w.r.t to PAGE_SIZE 36 36 */ 37 - #define HL_MMAP_TYPE_SHIFT (62 - PAGE_SHIFT) 38 - #define HL_MMAP_TYPE_MASK (0x3ull << HL_MMAP_TYPE_SHIFT) 37 + #define HL_MMAP_TYPE_SHIFT (61 - PAGE_SHIFT) 38 + #define HL_MMAP_TYPE_MASK (0x7ull << HL_MMAP_TYPE_SHIFT) 39 + #define HL_MMAP_TYPE_BLOCK (0x4ull << HL_MMAP_TYPE_SHIFT) 39 40 #define HL_MMAP_TYPE_CB (0x2ull << HL_MMAP_TYPE_SHIFT) 40 41 41 - #define HL_MMAP_OFFSET_VALUE_MASK (0x3FFFFFFFFFFFull >> PAGE_SHIFT) 42 + #define HL_MMAP_OFFSET_VALUE_MASK (0x1FFFFFFFFFFFull >> PAGE_SHIFT) 42 43 #define HL_MMAP_OFFSET_VALUE_GET(off) (off & HL_MMAP_OFFSET_VALUE_MASK) 43 44 44 45 #define HL_PENDING_RESET_PER_SEC 10 ··· 409 408 * @sync_stream_first_mon: first monitor available for sync stream use 410 409 * @first_available_user_sob: first sob available for the user 411 410 * @first_available_user_mon: first monitor available for the user 411 + * @first_available_user_msix_interrupt: first available msix interrupt 412 + * reserved for the user 412 413 * @tpc_enabled_mask: which TPCs are enabled. 413 414 * @completion_queues_count: number of completion queues. 414 415 * @fw_security_disabled: true if security measures are disabled in firmware, ··· 419 416 * from BOOT_DEV_STS0 420 417 * @dram_supports_virtual_memory: is there an MMU towards the DRAM 421 418 * @hard_reset_done_by_fw: true if firmware is handling hard reset flow 419 + * @num_functional_hbms: number of functional HBMs in each DCORE. 
422 420 */ 423 421 struct asic_fixed_properties { 424 422 struct hw_queue_properties *hw_queues_props; ··· 472 468 u16 sync_stream_first_mon; 473 469 u16 first_available_user_sob[HL_MAX_DCORES]; 474 470 u16 first_available_user_mon[HL_MAX_DCORES]; 471 + u16 first_available_user_msix_interrupt; 475 472 u8 tpc_enabled_mask; 476 473 u8 completion_queues_count; 477 474 u8 fw_security_disabled; 478 475 u8 fw_security_status_valid; 479 476 u8 dram_supports_virtual_memory; 480 477 u8 hard_reset_done_by_fw; 478 + u8 num_functional_hbms; 481 479 }; 482 480 483 481 /** 484 482 * struct hl_fence - software synchronization primitive 485 483 * @completion: fence is implemented using completion 486 484 * @refcount: refcount for this fence 485 + * @cs_sequence: sequence of the corresponding command submission 487 486 * @error: mark this fence with error 488 487 * @timestamp: timestamp upon completion 489 488 * ··· 494 487 struct hl_fence { 495 488 struct completion completion; 496 489 struct kref refcount; 490 + u64 cs_sequence; 497 491 int error; 498 492 ktime_t timestamp; 499 493 }; ··· 854 846 * @collective_wait_init_cs: Generate collective master/slave packets 855 847 * and place them in the relevant cs jobs 856 848 * @collective_wait_create_jobs: allocate collective wait cs jobs 849 + * @scramble_addr: Routine to scramble the address prior of mapping it 850 + * in the MMU. 851 + * @descramble_addr: Routine to de-scramble the address prior of 852 + * showing it to users. 853 + * @ack_protection_bits_errors: ack and dump all security violations 854 + * @get_hw_block_id: retrieve a HW block id to be used by the user to mmap it. 855 + * @hw_block_mmap: mmap a HW block with a given id. 
857 856 */ 858 857 struct hl_asic_funcs { 859 858 int (*early_init)(struct hl_device *hdev); ··· 933 918 void (*set_clock_gating)(struct hl_device *hdev); 934 919 void (*disable_clock_gating)(struct hl_device *hdev); 935 920 int (*debug_coresight)(struct hl_device *hdev, void *data); 936 - bool (*is_device_idle)(struct hl_device *hdev, u64 *mask, 937 - struct seq_file *s); 921 + bool (*is_device_idle)(struct hl_device *hdev, u64 *mask_arr, 922 + u8 mask_len, struct seq_file *s); 938 923 int (*soft_reset_late_init)(struct hl_device *hdev); 939 924 void (*hw_queues_lock)(struct hl_device *hdev); 940 925 void (*hw_queues_unlock)(struct hl_device *hdev); ··· 970 955 int (*collective_wait_create_jobs)(struct hl_device *hdev, 971 956 struct hl_ctx *ctx, struct hl_cs *cs, u32 wait_queue_id, 972 957 u32 collective_engine_id); 958 + u64 (*scramble_addr)(struct hl_device *hdev, u64 addr); 959 + u64 (*descramble_addr)(struct hl_device *hdev, u64 addr); 960 + void (*ack_protection_bits_errors)(struct hl_device *hdev); 961 + int (*get_hw_block_id)(struct hl_device *hdev, u64 block_addr, 962 + u32 *block_id); 963 + int (*hw_block_mmap)(struct hl_device *hdev, struct vm_area_struct *vma, 964 + u32 block_id, u32 block_size); 973 965 }; 974 966 975 967 ··· 1034 1012 }; 1035 1013 1036 1014 /** 1015 + * struct hl_pending_cb - pending command buffer structure 1016 + * @cb_node: cb node in pending cb list 1017 + * @cb: command buffer to send in next submission 1018 + * @cb_size: command buffer size 1019 + * @hw_queue_id: destination queue id 1020 + */ 1021 + struct hl_pending_cb { 1022 + struct list_head cb_node; 1023 + struct hl_cb *cb; 1024 + u32 cb_size; 1025 + u32 hw_queue_id; 1026 + }; 1027 + 1028 + /** 1037 1029 * struct hl_ctx - user/kernel context. 1038 1030 * @mem_hash: holds mapping from virtual address to virtual memory area 1039 1031 * descriptor (hl_vm_phys_pg_list or hl_userptr). ··· 1062 1026 * @mmu_lock: protects the MMU page tables. 
 *		Any change to the PGT, modifying the
 *		MMU hash or walking the PGT requires taking this lock.
 * @debugfs_list: node in debugfs list of contexts.
+ * @pending_cb_list: list of pending command buffers waiting to be sent upon
+ *		the next user command submission.
 * @cs_counters: context command submission counters.
 * @cb_va_pool: device VA pool for command buffers which are mapped to the
 *		device's MMU.
···
 *		index to cs_pending array.
 * @dram_default_hops: array that holds all hops addresses needed for default
 *		DRAM mapping.
+ * @pending_cb_lock: spinlock to protect pending cb list
 * @cs_lock: spinlock to protect cs_sequence.
 * @dram_phys_mem: amount of used physical DRAM memory by this context.
 * @thread_ctx_switch_token: token to prevent multiple threads of the same
 *		context from running the context switch phase.
 *		Only a single thread should run it.
+ * @thread_pending_cb_token: token to prevent multiple threads from processing
+ *		the pending CB list. Only a single thread should
+ *		process the list since it is protected by a
+ *		spinlock and we don't want to halt the entire
+ *		command submission sequence.
 * @thread_ctx_switch_wait_token: token to prevent the threads that didn't run
 *		the context switch phase from moving to their
 *		execution phase before the context switch phase
···
	struct mutex mem_hash_lock;
	struct mutex mmu_lock;
	struct list_head debugfs_list;
+	struct list_head pending_cb_list;
	struct hl_cs_counters_atomic cs_counters;
	struct gen_pool *cb_va_pool;
	u64 cs_sequence;
	u64 *dram_default_hops;
+	spinlock_t pending_cb_lock;
	spinlock_t cs_lock;
	atomic64_t dram_phys_mem;
	atomic_t thread_ctx_switch_token;
+	atomic_t thread_pending_cb_token;
	u32 thread_ctx_switch_wait_token;
	u32 asid;
	u32 handle;
···
 * @finish_work: workqueue object to run when CS is completed by H/W.
 * @work_tdr: delayed work node for TDR.
 * @mirror_node: node in device mirror list of command submissions.
+ * @staged_cs_node: node in the staged cs list.
 * @debugfs_list: node in debugfs list of command submissions.
 * @sequence: the sequence number of this CS.
+ * @staged_sequence: the sequence of the staged submission this CS is part of,
+ *		relevant only if staged_cs is set.
 * @type: CS_TYPE_*.
 * @submitted: true if CS was submitted to H/W.
 * @completed: true if CS was completed by device.
···
 * @tdr_active: true if TDR was activated for this CS (to prevent
 *		double TDR activation).
 * @aborted: true if CS was aborted due to some device error.
- * @timestamp: true if a timestamp must be captured upon completion
+ * @timestamp: true if a timestamp must be captured upon completion.
+ * @staged_last: true if this is the last staged CS and needs completion.
+ * @staged_first: true if this is the first staged CS and we need to receive
+ *		timeout for this CS.
+ * @staged_cs: true if this CS is part of a staged submission.
 */
struct hl_cs {
	u16 *jobs_in_queue_cnt;
···
	struct work_struct finish_work;
	struct delayed_work work_tdr;
	struct list_head mirror_node;
+	struct list_head staged_cs_node;
	struct list_head debugfs_list;
	u64 sequence;
+	u64 staged_sequence;
	enum hl_cs_type type;
	u8 submitted;
	u8 completed;
···
	u8 tdr_active;
	u8 aborted;
	u8 timestamp;
+	u8 staged_last;
+	u8 staged_first;
+	u8 staged_cs;
};

/**
···
 *		MSG_PROT packets. Relevant only for GAUDI as GOYA doesn't
 *		have streams so the engine can't be busy by another
 *		stream.
+ * @completion: true if we need completion for this CS.
 */
struct hl_cs_parser {
	struct hl_cb *user_cb;
···
	u8 job_id;
	u8 is_kernel_allocated_cb;
	u8 contains_dma_pkt;
+	u8 completion;
};

/*
···
 * struct hl_mmu_hop_info - A structure describing the TLB hops and their
 * hop-entries that were created in order to translate a virtual address to a
 * physical one.
+ * @scrambled_vaddr: The value of the virtual address after scrambling. This
+ *		address replaces the original virtual-address when mapped
+ *		in the MMU tables.
+ * @unscrambled_paddr: The un-scrambled physical address.
 * @hop_info: Array holding the per-hop information used for the translation.
 * @used_hops: The number of hops used for the translation.
+ * @range_type: virtual address range type.
 */
struct hl_mmu_hop_info {
+	u64 scrambled_vaddr;
+	u64 unscrambled_paddr;
	struct hl_mmu_per_hop_info hop_info[MMU_ARCH_5_HOPS];
	u32 used_hops;
+	enum hl_va_range_type range_type;
};

/**
···
 * @asic_funcs: ASIC specific functions.
 * @asic_specific: ASIC specific information to use only from ASIC files.
 * @vm: virtual memory manager for MMU.
- * @mmu_cache_lock: protects MMU cache invalidation as it can serve one context.
 * @hwmon_dev: H/W monitor device.
 * @pm_mng_profile: current power management profile.
 * @hl_chip_info: ASIC's sensors information.
···
 *		user processes
 * @device_fini_pending: true if device_fini was called and might be
 *		waiting for the reset thread to finish
+ * @supports_staged_submission: true if staged submissions are supported
 */
struct hl_device {
	struct pci_dev *pdev;
···
	const struct hl_asic_funcs *asic_funcs;
	void *asic_specific;
	struct hl_vm vm;
-	struct mutex mmu_cache_lock;
	struct device *hwmon_dev;
	enum hl_pm_mng_profile pm_mng_profile;
	struct hwmon_chip_info *hl_chip_info;
···
	u8 needs_reset;
	u8 process_kill_trial_cnt;
	u8 device_fini_pending;
+	u8 supports_staged_submission;

	/* Parameters for bring-up */
	u64 nic_ports_mask;
···
int hl_hw_queue_schedule_cs(struct hl_cs *cs);
u32 hl_hw_queue_add_ptr(u32 ptr, u16 val);
void hl_hw_queue_inc_ci_kernel(struct hl_device *hdev, u32 hw_queue_id);
- void hl_int_hw_queue_update_ci(struct hl_cs *cs);
+ void hl_hw_queue_update_ci(struct hl_cs *cs);
void hl_hw_queue_reset(struct hl_device *hdev, bool hard_reset);

#define hl_queue_inc_ptr(p) hl_hw_queue_add_ptr(p, 1)
···
			bool map_cb, u64 *handle);
int hl_cb_destroy(struct hl_device *hdev, struct hl_cb_mgr *mgr, u64 cb_handle);
int hl_cb_mmap(struct hl_fpriv *hpriv, struct vm_area_struct *vma);
+ int hl_hw_block_mmap(struct hl_fpriv *hpriv, struct vm_area_struct *vma);
struct hl_cb *hl_cb_get(struct hl_device *hdev, struct hl_cb_mgr *mgr,
			u32 handle);
void hl_cb_put(struct hl_cb *cb);
···
void hl_cb_va_pool_fini(struct hl_ctx *ctx);

void hl_cs_rollback_all(struct hl_device *hdev);
+ void hl_pending_cb_list_flush(struct hl_ctx *ctx);
struct hl_cs_job *hl_cs_allocate_job(struct hl_device *hdev,
		enum hl_queue_type queue_type, bool is_kernel_allocated_cb);
void hl_sob_reset_error(struct kref *ref);
···
void hl_fence_put(struct hl_fence *fence);
void hl_fence_get(struct hl_fence *fence);
void cs_get(struct hl_cs *cs);
+ bool cs_needs_completion(struct hl_cs *cs);
+ bool cs_needs_timeout(struct hl_cs *cs);
+ bool is_staged_cs_last_exists(struct hl_device *hdev, struct hl_cs *cs);
+ struct hl_cs *hl_staged_cs_find_first(struct hl_device *hdev, u64 cs_seq);

void goya_set_asic_funcs(struct hl_device *hdev);
void gaudi_set_asic_funcs(struct hl_device *hdev);
···
int hl_mmu_va_to_pa(struct hl_ctx *ctx, u64 virt_addr, u64 *phys_addr);
int hl_mmu_get_tlb_info(struct hl_ctx *ctx, u64 virt_addr,
			struct hl_mmu_hop_info *hops);
+ u64 hl_mmu_scramble_addr(struct hl_device *hdev, u64 addr);
+ u64 hl_mmu_descramble_addr(struct hl_device *hdev, u64 addr);
bool hl_is_dram_va(struct hl_device *hdev, u64 virt_addr);

int hl_fw_load_fw_to_device(struct hl_device *hdev, const char *fw_name,
···
				void *vaddr);
int hl_fw_send_heartbeat(struct hl_device *hdev);
int hl_fw_cpucp_info_get(struct hl_device *hdev,
-			u32 cpu_security_boot_status_reg);
+			u32 cpu_security_boot_status_reg,
+			u32 boot_err0_reg);
int hl_fw_get_eeprom_data(struct hl_device *hdev, void *data, size_t max_size);
int hl_fw_cpucp_pci_counters_get(struct hl_device *hdev,
			struct hl_info_pci_counters *counters);
drivers/misc/habanalabs/common/habanalabs_ioctl.c (+18 -4)
···

	hw_ip.device_id = hdev->asic_funcs->get_pci_id(hdev);
	hw_ip.sram_base_address = prop->sram_user_base_address;
-	hw_ip.dram_base_address = prop->dram_user_base_address;
+	hw_ip.dram_base_address =
+			hdev->mmu_enable && prop->dram_supports_virtual_memory ?
+			prop->dmmu.start_addr : prop->dram_user_base_address;
	hw_ip.tpc_enabled_mask = prop->tpc_enabled_mask;
	hw_ip.sram_size = prop->sram_size - sram_kmd_size;
-	hw_ip.dram_size = prop->dram_size - dram_kmd_size;
+
+	if (hdev->mmu_enable)
+		hw_ip.dram_size =
+			DIV_ROUND_DOWN_ULL(prop->dram_size - dram_kmd_size,
+					prop->dram_page_size) *
+					prop->dram_page_size;
+	else
+		hw_ip.dram_size = prop->dram_size - dram_kmd_size;
+
	if (hw_ip.dram_size > PAGE_SIZE)
		hw_ip.dram_enabled = 1;
+	hw_ip.dram_page_size = prop->dram_page_size;
	hw_ip.num_of_events = prop->num_of_events;

	memcpy(hw_ip.cpucp_version, prop->cpucp_info.cpucp_version,
···
	hw_ip.psoc_pci_pll_od = prop->psoc_pci_pll_od;
	hw_ip.psoc_pci_pll_div_factor = prop->psoc_pci_pll_div_factor;

+	hw_ip.first_available_interrupt_id =
+			prop->first_available_user_msix_interrupt;
	return copy_to_user(out, &hw_ip,
		min((size_t)size, sizeof(hw_ip))) ? -EFAULT : 0;
}
···
		return -EINVAL;

	hw_idle.is_idle = hdev->asic_funcs->is_device_idle(hdev,
-					&hw_idle.busy_engines_mask_ext, NULL);
+					hw_idle.busy_engines_mask_ext,
+					HL_BUSY_ENGINES_MASK_EXT_SIZE, NULL);
	hw_idle.busy_engines_mask =
-			lower_32_bits(hw_idle.busy_engines_mask_ext);
+			lower_32_bits(hw_idle.busy_engines_mask_ext[0]);

	return copy_to_user(out, &hw_idle,
		min((size_t) max_size, sizeof(hw_idle))) ? -EFAULT : 0;
drivers/misc/habanalabs/common/hw_queue.c (+46 -5)
···
	return (abs(delta) - queue_len);
}

- void hl_int_hw_queue_update_ci(struct hl_cs *cs)
+ void hl_hw_queue_update_ci(struct hl_cs *cs)
{
	struct hl_device *hdev = cs->ctx->hdev;
	struct hl_hw_queue *q;
···
	if (!hdev->asic_prop.max_queues || q->queue_type == QUEUE_TYPE_HW)
		return;

+	/* We must increment CI for every queue that will never get a
+	 * completion. There are 2 scenarios in which this can happen:
+	 * 1. All queues of a non-completion CS will never get a completion.
+	 * 2. Internal queues never get a completion.
+	 */
	for (i = 0 ; i < hdev->asic_prop.max_queues ; i++, q++) {
-		if (q->queue_type == QUEUE_TYPE_INT)
+		if (!cs_needs_completion(cs) || q->queue_type == QUEUE_TYPE_INT)
			atomic_add(cs->jobs_in_queue_cnt[i], &q->ci);
	}
}
···
	len = job->job_cb_size;
	ptr = cb->bus_address;

+	/* Skip completion flow in case this is a non completion CS */
+	if (!cs_needs_completion(job->cs))
+		goto submit_bd;
+
	cq_pkt.data = cpu_to_le32(
			((q->pi << CQ_ENTRY_SHADOW_INDEX_SHIFT)
				& CQ_ENTRY_SHADOW_INDEX_MASK) |
···

	cq->pi = hl_cq_inc_ptr(cq->pi);

+ submit_bd:
	ext_and_hw_queue_submit_bd(hdev, q, ctl, len, ptr);
}

···
	struct hl_cs_job *job, *tmp;
	struct hl_hw_queue *q;
	int rc = 0, i, cq_cnt;
+	bool first_entry;
	u32 max_queues;

	cntr = &hdev->aggregated_cs_counters;
···
		switch (q->queue_type) {
		case QUEUE_TYPE_EXT:
			rc = ext_queue_sanity_checks(hdev, q,
-					cs->jobs_in_queue_cnt[i], true);
+					cs->jobs_in_queue_cnt[i],
+					cs_needs_completion(cs) ?
+						true : false);
			break;
		case QUEUE_TYPE_INT:
			rc = int_queue_sanity_checks(hdev, q,
···
	hdev->asic_funcs->collective_wait_init_cs(cs);

	spin_lock(&hdev->cs_mirror_lock);
+
+	/* Verify staged CS exists and add to the staged list */
+	if (cs->staged_cs && !cs->staged_first) {
+		struct hl_cs *staged_cs;
+
+		staged_cs = hl_staged_cs_find_first(hdev, cs->staged_sequence);
+		if (!staged_cs) {
+			dev_err(hdev->dev,
+				"Cannot find staged submission sequence %llu",
+				cs->staged_sequence);
+			rc = -EINVAL;
+			goto unlock_cs_mirror;
+		}
+
+		if (is_staged_cs_last_exists(hdev, staged_cs)) {
+			dev_err(hdev->dev,
+				"Staged submission sequence %llu already submitted",
+				cs->staged_sequence);
+			rc = -EINVAL;
+			goto unlock_cs_mirror;
+		}
+
+		list_add_tail(&cs->staged_cs_node, &staged_cs->staged_cs_node);
+	}
+
	list_add_tail(&cs->mirror_node, &hdev->cs_mirror_list);

	/* Queue TDR if the CS is the first entry and if timeout is wanted */
+	first_entry = list_first_entry(&hdev->cs_mirror_list,
+					struct hl_cs, mirror_node) == cs;
	if ((hdev->timeout_jiffies != MAX_SCHEDULE_TIMEOUT) &&
-			(list_first_entry(&hdev->cs_mirror_list,
-					struct hl_cs, mirror_node) == cs)) {
+			first_entry && cs_needs_timeout(cs)) {
		cs->tdr_active = true;
		schedule_delayed_work(&cs->work_tdr, hdev->timeout_jiffies);

···

	goto out;

+ unlock_cs_mirror:
+	spin_unlock(&hdev->cs_mirror_lock);
unroll_cq_resv:
	q = &hdev->kernel_queues[0];
	for (i = 0 ; (i < max_queues) && (cq_cnt > 0) ; i++, q++) {
drivers/misc/habanalabs/common/memory.c (+371 -243)
···

#define HL_MMU_DEBUG	0

+ /* use small pages for supporting non-pow2 (32M/40M/48M) DRAM phys page sizes */
+ #define DRAM_POOL_PAGE_SIZE	SZ_8M
+
/*
 * The va ranges in context object contain a list with the available chunks of
 * device virtual memory.
···
 */

/*
- * alloc_device_memory - allocate device memory
- *
- * @ctx : current context
- * @args : host parameters containing the requested size
- * @ret_handle : result handle
+ * alloc_device_memory() - allocate device memory.
+ * @ctx: pointer to the context structure.
+ * @args: host parameters containing the requested size.
+ * @ret_handle: result handle.
 *
 * This function does the following:
- * - Allocate the requested size rounded up to 'dram_page_size' pages
- * - Return unique handle
+ * - Allocate the requested size rounded up to 'dram_page_size' pages.
+ * - Return unique handle for later map/unmap/free.
 */
static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
				u32 *ret_handle)
···
	struct hl_vm *vm = &hdev->vm;
	struct hl_vm_phys_pg_pack *phys_pg_pack;
	u64 paddr = 0, total_size, num_pgs, i;
-	u32 num_curr_pgs, page_size, page_shift;
+	u32 num_curr_pgs, page_size;
	int handle, rc;
	bool contiguous;

	num_curr_pgs = 0;
	page_size = hdev->asic_prop.dram_page_size;
-	page_shift = __ffs(page_size);
-	num_pgs = (args->alloc.mem_size + (page_size - 1)) >> page_shift;
-	total_size = num_pgs << page_shift;
+	num_pgs = DIV_ROUND_UP_ULL(args->alloc.mem_size, page_size);
+	total_size = num_pgs * page_size;

	if (!total_size) {
		dev_err(hdev->dev, "Cannot allocate 0 bytes\n");
···
	return rc;
}

- /*
- * dma_map_host_va - DMA mapping of the given host virtual address.
- * @hdev: habanalabs device structure
- * @addr: the host virtual address of the memory area
- * @size: the size of the memory area
- * @p_userptr: pointer to result userptr structure
+ /**
+ * dma_map_host_va() - DMA mapping of the given host virtual address.
+ * @hdev: habanalabs device structure.
+ * @addr: the host virtual address of the memory area.
+ * @size: the size of the memory area.
+ * @p_userptr: pointer to result userptr structure.
 *
 * This function does the following:
- * - Allocate userptr structure
- * - Pin the given host memory using the userptr structure
- * - Perform DMA mapping to have the DMA addresses of the pages
+ * - Allocate userptr structure.
+ * - Pin the given host memory using the userptr structure.
+ * - Perform DMA mapping to have the DMA addresses of the pages.
 */
static int dma_map_host_va(struct hl_device *hdev, u64 addr, u64 size,
				struct hl_userptr **p_userptr)
···
	return rc;
}

- /*
- * dma_unmap_host_va - DMA unmapping of the given host virtual address.
- * @hdev: habanalabs device structure
- * @userptr: userptr to free
+ /**
+ * dma_unmap_host_va() - DMA unmapping of the given host virtual address.
+ * @hdev: habanalabs device structure.
+ * @userptr: userptr to free.
 *
 * This function does the following:
- * - Unpins the physical pages
- * - Frees the userptr structure
+ * - Unpins the physical pages.
+ * - Frees the userptr structure.
 */
static void dma_unmap_host_va(struct hl_device *hdev,
				struct hl_userptr *userptr)
···
	kfree(userptr);
}

- /*
- * dram_pg_pool_do_release - free DRAM pages pool
- *
- * @ref : pointer to reference object
+ /**
+ * dram_pg_pool_do_release() - free DRAM pages pool
+ * @ref: pointer to reference object.
 *
 * This function does the following:
- * - Frees the idr structure of physical pages handles
- * - Frees the generic pool of DRAM physical pages
+ * - Frees the idr structure of physical pages handles.
+ * - Frees the generic pool of DRAM physical pages.
 */
static void dram_pg_pool_do_release(struct kref *ref)
{
···
	gen_pool_destroy(vm->dram_pg_pool);
}

- /*
- * free_phys_pg_pack - free physical page pack
- * @hdev: habanalabs device structure
- * @phys_pg_pack: physical page pack to free
+ /**
+ * free_phys_pg_pack() - free physical page pack.
+ * @hdev: habanalabs device structure.
+ * @phys_pg_pack: physical page pack to free.
 *
 * This function does the following:
 * - For DRAM memory only, iterate over the pack and free each physical block
- *   structure by returning it to the general pool
- * - Free the hl_vm_phys_pg_pack structure
+ *   structure by returning it to the general pool.
+ * - Free the hl_vm_phys_pg_pack structure.
 */
static void free_phys_pg_pack(struct hl_device *hdev,
				struct hl_vm_phys_pg_pack *phys_pg_pack)
···
	kfree(phys_pg_pack);
}

- /*
- * free_device_memory - free device memory
- *
- * @ctx : current context
- * @handle : handle of the memory chunk to free
+ /**
+ * free_device_memory() - free device memory.
+ * @ctx: pointer to the context structure.
+ * @args: host parameters containing the requested size.
 *
 * This function does the following:
- * - Free the device memory related to the given handle
+ * - Free the device memory related to the given handle.
 */
- static int free_device_memory(struct hl_ctx *ctx, u32 handle)
+ static int free_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args)
{
	struct hl_device *hdev = ctx->hdev;
	struct hl_vm *vm = &hdev->vm;
	struct hl_vm_phys_pg_pack *phys_pg_pack;
+	u32 handle = args->free.handle;

	spin_lock(&vm->idr_lock);
	phys_pg_pack = idr_find(&vm->phys_pg_pack_handles, handle);
···
	return 0;
}

- /*
- * clear_va_list_locked - free virtual addresses list
- *
- * @hdev : habanalabs device structure
- * @va_list : list of virtual addresses to free
+ /**
+ * clear_va_list_locked() - free virtual addresses list.
+ * @hdev: habanalabs device structure.
+ * @va_list: list of virtual addresses to free.
 *
 * This function does the following:
- * - Iterate over the list and free each virtual addresses block
+ * - Iterate over the list and free each virtual addresses block.
 *
- * This function should be called only when va_list lock is taken
+ * This function should be called only when va_list lock is taken.
 */
static void clear_va_list_locked(struct hl_device *hdev,
		struct list_head *va_list)
···
	}
}

- /*
- * print_va_list_locked - print virtual addresses list
- *
- * @hdev : habanalabs device structure
- * @va_list : list of virtual addresses to print
+ /**
+ * print_va_list_locked() - print virtual addresses list.
+ * @hdev: habanalabs device structure.
+ * @va_list: list of virtual addresses to print.
 *
 * This function does the following:
 * - Iterate over the list and print each virtual addresses block
 *
- * This function should be called only when va_list lock is taken
+ * This function should be called only when va_list lock is taken.
 */
static void print_va_list_locked(struct hl_device *hdev,
		struct list_head *va_list)
···
#endif
}

- /*
- * merge_va_blocks_locked - merge a virtual block if possible
- *
- * @hdev : pointer to the habanalabs device structure
- * @va_list : pointer to the virtual addresses block list
- * @va_block : virtual block to merge with adjacent blocks
+ /**
+ * merge_va_blocks_locked() - merge a virtual block if possible.
+ * @hdev: pointer to the habanalabs device structure.
+ * @va_list: pointer to the virtual addresses block list.
+ * @va_block: virtual block to merge with adjacent blocks.
 *
 * This function does the following:
 * - Merge the given blocks with the adjacent blocks if their virtual ranges
- *   create a contiguous virtual range
+ *   create a contiguous virtual range.
 *
- * This function should be called only when va_list lock is taken
+ * This function should be called only when va_list lock is taken.
 */
static void merge_va_blocks_locked(struct hl_device *hdev,
		struct list_head *va_list, struct hl_vm_va_block *va_block)
···
	}
}

- /*
- * add_va_block_locked - add a virtual block to the virtual addresses list
- *
- * @hdev : pointer to the habanalabs device structure
- * @va_list : pointer to the virtual addresses block list
- * @start : start virtual address
- * @end : end virtual address
+ /**
+ * add_va_block_locked() - add a virtual block to the virtual addresses list.
+ * @hdev: pointer to the habanalabs device structure.
+ * @va_list: pointer to the virtual addresses block list.
+ * @start: start virtual address.
+ * @end: end virtual address.
 *
 * This function does the following:
- * - Add the given block to the virtual blocks list and merge with other
- *   blocks if a contiguous virtual block can be created
+ * - Add the given block to the virtual blocks list and merge with other blocks
+ *   if a contiguous virtual block can be created.
 *
- * This function should be called only when va_list lock is taken
+ * This function should be called only when va_list lock is taken.
 */
static int add_va_block_locked(struct hl_device *hdev,
		struct list_head *va_list, u64 start, u64 end)
···
	return 0;
}

- /*
- * add_va_block - wrapper for add_va_block_locked
- *
- * @hdev : pointer to the habanalabs device structure
- * @va_list : pointer to the virtual addresses block list
- * @start : start virtual address
- * @end : end virtual address
+ /**
+ * add_va_block() - wrapper for add_va_block_locked.
+ * @hdev: pointer to the habanalabs device structure.
+ * @va_list: pointer to the virtual addresses block list.
+ * @start: start virtual address.
+ * @end: end virtual address.
 *
 * This function does the following:
- * - Takes the list lock and calls add_va_block_locked
+ * - Takes the list lock and calls add_va_block_locked.
 */
static inline int add_va_block(struct hl_device *hdev,
		struct hl_va_range *va_range, u64 start, u64 end)
···
	return rc;
}

- /*
+ /**
 * get_va_block() - get a virtual block for the given size and alignment.
+ *
 * @hdev: pointer to the habanalabs device structure.
 * @va_range: pointer to the virtual addresses range.
 * @size: requested block size.
···
 *
 * This function does the following:
 * - Iterate on the virtual block list to find a suitable virtual block for the
- *   given size and alignment.
+ *   given size, hint address and alignment.
 * - Reserve the requested block and update the list.
 * - Return the start address of the virtual block.
 */
- static u64 get_va_block(struct hl_device *hdev, struct hl_va_range *va_range,
-			u64 size, u64 hint_addr, u32 va_block_align)
+ static u64 get_va_block(struct hl_device *hdev,
+			struct hl_va_range *va_range,
+			u64 size, u64 hint_addr, u32 va_block_align)
{
	struct hl_vm_va_block *va_block, *new_va_block = NULL;
-	u64 valid_start, valid_size, prev_start, prev_end, align_mask,
-		res_valid_start = 0, res_valid_size = 0;
+	u64 tmp_hint_addr, valid_start, valid_size, prev_start, prev_end,
+		align_mask, reserved_valid_start = 0, reserved_valid_size = 0;
	bool add_prev = false;
+	bool is_align_pow_2 = is_power_of_2(va_range->page_size);

-	align_mask = ~((u64)va_block_align - 1);
+	if (is_align_pow_2)
+		align_mask = ~((u64)va_block_align - 1);
+	else
+		/*
+		 * with non-power-of-2 range we work only with page granularity
+		 * and the start address is page aligned,
+		 * so no need for alignment checking.
+		 */
+		size = DIV_ROUND_UP_ULL(size, va_range->page_size) *
+				va_range->page_size;

-	/* check if hint_addr is aligned */
-	if (hint_addr & (va_block_align - 1))
+	tmp_hint_addr = hint_addr;
+
+	/* Check if we need to ignore hint address */
+	if ((is_align_pow_2 && (hint_addr & (va_block_align - 1))) ||
+			(!is_align_pow_2 &&
+				do_div(tmp_hint_addr, va_range->page_size))) {
+		dev_info(hdev->dev, "Hint address 0x%llx will be ignored\n",
+			hint_addr);
		hint_addr = 0;
+	}

	mutex_lock(&va_range->lock);

	print_va_list_locked(hdev, &va_range->list);

	list_for_each_entry(va_block, &va_range->list, node) {
-		/* calc the first possible aligned addr */
+		/* Calc the first possible aligned addr */
		valid_start = va_block->start;

-		if (valid_start & (va_block_align - 1)) {
+		if (is_align_pow_2 && (valid_start & (va_block_align - 1))) {
			valid_start &= align_mask;
			valid_start += va_block_align;
			if (valid_start > va_block->end)
···
		}

		valid_size = va_block->end - valid_start;
+		if (valid_size < size)
+			continue;

-		if (valid_size >= size &&
-			(!new_va_block || valid_size < res_valid_size)) {
+		/* Pick the minimal length block which has the required size */
+		if (!new_va_block || (valid_size < reserved_valid_size)) {
			new_va_block = va_block;
-			res_valid_start = valid_start;
-			res_valid_size = valid_size;
+			reserved_valid_start = valid_start;
+			reserved_valid_size = valid_size;
		}

		if (hint_addr && hint_addr >= valid_start &&
-				((hint_addr + size) <= va_block->end)) {
+				(hint_addr + size) <= va_block->end) {
			new_va_block = va_block;
-			res_valid_start = hint_addr;
-			res_valid_size = valid_size;
+			reserved_valid_start = hint_addr;
+			reserved_valid_size = valid_size;
			break;
		}
	}

	if (!new_va_block) {
		dev_err(hdev->dev, "no available va block for size %llu\n",
			size);
		goto out;
	}

-	if (res_valid_start > new_va_block->start) {
+	/*
+	 * Check if there is some leftover range due to reserving the new
+	 * va block, then return it to the main virtual addresses list.
+	 */
+	if (reserved_valid_start > new_va_block->start) {
		prev_start = new_va_block->start;
-		prev_end = res_valid_start - 1;
+		prev_end = reserved_valid_start - 1;

-		new_va_block->start = res_valid_start;
-		new_va_block->size = res_valid_size;
+		new_va_block->start = reserved_valid_start;
+		new_va_block->size = reserved_valid_size;

		add_prev = true;
	}
···
out:
	mutex_unlock(&va_range->lock);

-	return res_valid_start;
+	return reserved_valid_start;
}

/*
···

/**
 * hl_get_va_range_type() - get va_range type for the given address and size.
- * @address: The start address of the area we want to validate.
- * @size: The size in bytes of the area we want to validate.
- * @type: returned va_range type
+ * @address: the start address of the area we want to validate.
+ * @size: the size in bytes of the area we want to validate.
+ * @type: returned va_range type.
 *
 * Return: true if the area is inside a valid range, false otherwise.
 */
···
	return -EINVAL;
}

- /*
- * hl_unreserve_va_block - wrapper for add_va_block for unreserving a va block
- *
+ /**
+ * hl_unreserve_va_block() - wrapper for add_va_block to unreserve a va block.
 * @hdev: pointer to the habanalabs device structure
- * @ctx: current context
- * @start: start virtual address
- * @end: end virtual address
+ * @ctx: pointer to the context structure.
+ * @start: start virtual address.
+ * @end: end virtual address.
 *
 * This function does the following:
- * - Takes the list lock and calls add_va_block_locked
+ * - Takes the list lock and calls add_va_block_locked.
 */
int hl_unreserve_va_block(struct hl_device *hdev, struct hl_ctx *ctx,
		u64 start_addr, u64 size)
···
	return rc;
}

- /*
- * get_sg_info - get number of pages and the DMA address from SG list
- *
- * @sg : the SG list
- * @dma_addr : pointer to DMA address to return
+ /**
+ * get_sg_info() - get number of pages and the DMA address from SG list.
+ * @sg: the SG list.
+ * @dma_addr: pointer to DMA address to return.
 *
 * Calculate the number of consecutive pages described by the SG list. Take the
 * offset of the address in the first page, add to it the length and round it up
···
			(PAGE_SIZE - 1)) >> PAGE_SHIFT;
}

- /*
- * init_phys_pg_pack_from_userptr - initialize physical page pack from host
- *                                  memory
- * @ctx: current context
- * @userptr: userptr to initialize from
- * @pphys_pg_pack: result pointer
+ /**
+ * init_phys_pg_pack_from_userptr() - initialize physical page pack from host
+ *                                    memory
+ * @ctx: pointer to the context structure.
+ * @userptr: userptr to initialize from.
+ * @pphys_pg_pack: result pointer.
 *
 * This function does the following:
- * - Pin the physical pages related to the given virtual block
+ * - Pin the physical pages related to the given virtual block.
 * - Create a physical page pack from the physical pages related to the given
- *   virtual block
+ *   virtual block.
 */
static int init_phys_pg_pack_from_userptr(struct hl_ctx *ctx,
				struct hl_userptr *userptr,
···
	return rc;
}

- /*
- * map_phys_pg_pack - maps the physical page pack.
- * @ctx: current context
- * @vaddr: start address of the virtual area to map from
- * @phys_pg_pack: the pack of physical pages to map to
+ /**
+ * map_phys_pg_pack() - maps the physical page pack.
+ * @ctx: pointer to the context structure.
+ * @vaddr: start address of the virtual area to map from.
+ * @phys_pg_pack: the pack of physical pages to map to.
 *
 * This function does the following:
- * - Maps each chunk of virtual memory to matching physical chunk
- * - Stores number of successful mappings in the given argument
- * - Returns 0 on success, error code otherwise
+ * - Maps each chunk of virtual memory to matching physical chunk.
+ * - Stores number of successful mappings in the given argument.
+ * - Returns 0 on success, error code otherwise.
 */
static int map_phys_pg_pack(struct hl_ctx *ctx, u64 vaddr,
				struct hl_vm_phys_pg_pack *phys_pg_pack)
···
	return rc;
}

- /*
- * unmap_phys_pg_pack - unmaps the physical page pack
- * @ctx: current context
- * @vaddr: start address of the virtual area to unmap
- * @phys_pg_pack: the pack of physical pages to unmap
+ /**
+ * unmap_phys_pg_pack() - unmaps the physical page pack.
+ * @ctx: pointer to the context structure.
+ * @vaddr: start address of the virtual area to unmap.
+ * @phys_pg_pack: the pack of physical pages to unmap.
 */
static void unmap_phys_pg_pack(struct hl_ctx *ctx, u64 vaddr,
				struct hl_vm_phys_pg_pack *phys_pg_pack)
···
}

static int get_paddr_from_handle(struct hl_ctx *ctx, struct hl_mem_in *args,
-		u64 *paddr)
+					u64 *paddr)
{
	struct hl_device *hdev = ctx->hdev;
	struct hl_vm *vm = &hdev->vm;
···
	return 0;
}

- /*
- * map_device_va - map the given memory
- *
- * @ctx : current context
- * @args : host parameters with handle/host virtual address
- * @device_addr : pointer to result device virtual address
+ /**
+ * map_device_va() - map the given memory.
+ * @ctx: pointer to the context structure.
+ * @args: host parameters with handle/host virtual address.
+ * @device_addr: pointer to result device virtual address.
 *
 * This function does the following:
 * - If given a physical device memory handle, map to a device virtual block
- *   and return the start address of this block
+ *   and return the start address of this block.
 * - If given a host virtual address and size, find the related physical pages,
 *   map a device virtual block to these pages and return the start address of
- *   this block
+ *   this block.
 */
static int map_device_va(struct hl_ctx *ctx, struct hl_mem_in *args,
			u64 *device_addr)
···

	hint_addr = args->map_device.hint_addr;

-	/* DRAM VA alignment is the same as the DRAM page size */
+	/* DRAM VA alignment is the same as the MMU page size */
	va_range = ctx->va_range[HL_VA_RANGE_TYPE_DRAM];
	va_block_align = hdev->asic_prop.dmmu.page_size;
}
···
	return rc;
}

- /*
- * unmap_device_va - unmap the given device virtual address
- *
- * @ctx : current context
- * @vaddr : device virtual address to unmap
- * @ctx_free : true if in context free flow, false otherwise.
+ /**
+ * unmap_device_va() - unmap the given device virtual address.
+ * @ctx: pointer to the context structure.
+ * @args: host parameters with device virtual address to unmap.
+ * @ctx_free: true if in context free flow, false otherwise.
 *
 * This function does the following:
- * - Unmap the physical pages related to the given virtual address
- * - return the device virtual block to the virtual block list
+ * - unmap the physical pages related to the given virtual address.
+ * - return the device virtual block to the virtual block list.
 */
- static int unmap_device_va(struct hl_ctx *ctx, u64 vaddr, bool ctx_free)
+ static int unmap_device_va(struct hl_ctx *ctx, struct hl_mem_in *args,
+				bool ctx_free)
{
	struct hl_device *hdev = ctx->hdev;
+	struct asic_fixed_properties *prop = &hdev->asic_prop;
	struct hl_vm_phys_pg_pack *phys_pg_pack = NULL;
	struct hl_vm_hash_node *hnode = NULL;
	struct hl_userptr *userptr = NULL;
	struct hl_va_range *va_range;
+	u64 vaddr = args->unmap.device_virt_addr;
	enum vm_type_t *vm_type;
	bool is_userptr;
	int rc = 0;
···
		goto mapping_cnt_err;
	}

-	vaddr &= ~(((u64) phys_pg_pack->page_size) - 1);
+	if (!is_userptr && !is_power_of_2(phys_pg_pack->page_size))
+		vaddr = prop->dram_base_address +
+			DIV_ROUND_DOWN_ULL(vaddr - prop->dram_base_address,
+					phys_pg_pack->page_size) *
+				phys_pg_pack->page_size;
+	else
+		vaddr &= ~(((u64) phys_pg_pack->page_size) - 1);

	mutex_lock(&ctx->mmu_lock);

···
	return rc;
}

+ static int map_block(struct hl_device *hdev, u64 address, u64 *handle)
+ {
+	u32 block_id = 0;
+	int rc;
+
+	rc = hdev->asic_funcs->get_hw_block_id(hdev, address, &block_id);
+
+	*handle = block_id | HL_MMAP_TYPE_BLOCK;
+	*handle <<= PAGE_SHIFT;
+
+	return rc;
+ }
+
+ static void hw_block_vm_close(struct vm_area_struct *vma)
+ {
+	struct hl_ctx *ctx = (struct hl_ctx *) vma->vm_private_data;
+
+	hl_ctx_put(ctx);
+	vma->vm_private_data = NULL;
+ }
+
+ static const struct vm_operations_struct hw_block_vm_ops = {
+	.close = hw_block_vm_close
+ };
+
+ /**
+ * hl_hw_block_mmap() - mmap a hw block to user.
1294 + * @hpriv: pointer to the private data of the fd 1295 + * @vma: pointer to vm_area_struct of the process 1296 + * 1297 + * Driver increments context reference for every HW block mapped in order 1298 + * to prevent user from closing FD without unmapping first 1299 + */ 1300 + int hl_hw_block_mmap(struct hl_fpriv *hpriv, struct vm_area_struct *vma) 1301 + { 1302 + struct hl_device *hdev = hpriv->hdev; 1303 + u32 block_id, block_size; 1304 + int rc; 1305 + 1306 + /* We use the page offset to hold the block id and thus we need to clear 1307 + * it before doing the mmap itself 1308 + */ 1309 + block_id = vma->vm_pgoff; 1310 + vma->vm_pgoff = 0; 1311 + 1312 + /* Driver only allows mapping of a complete HW block */ 1313 + block_size = vma->vm_end - vma->vm_start; 1314 + 1315 + #ifdef _HAS_TYPE_ARG_IN_ACCESS_OK 1316 + if (!access_ok(VERIFY_WRITE, 1317 + (void __user *) (uintptr_t) vma->vm_start, block_size)) { 1318 + #else 1319 + if (!access_ok((void __user *) (uintptr_t) vma->vm_start, block_size)) { 1320 + #endif 1321 + dev_err(hdev->dev, 1322 + "user pointer is invalid - 0x%lx\n", 1323 + vma->vm_start); 1324 + 1325 + return -EINVAL; 1326 + } 1327 + 1328 + vma->vm_ops = &hw_block_vm_ops; 1329 + vma->vm_private_data = hpriv->ctx; 1330 + 1331 + hl_ctx_get(hdev, hpriv->ctx); 1332 + 1333 + rc = hdev->asic_funcs->hw_block_mmap(hdev, vma, block_id, block_size); 1334 + if (rc) { 1335 + hl_ctx_put(hpriv->ctx); 1336 + return rc; 1337 + } 1338 + 1339 + vma->vm_pgoff = block_id; 1340 + 1341 + return 0; 1342 + } 1343 + 1292 1344 static int mem_ioctl_no_mmu(struct hl_fpriv *hpriv, union hl_mem_args *args) 1293 1345 { 1294 1346 struct hl_device *hdev = hpriv->hdev; 1295 1347 struct hl_ctx *ctx = hpriv->ctx; 1296 - u64 device_addr = 0; 1348 + u64 block_handle, device_addr = 0; 1297 1349 u32 handle = 0; 1298 1350 int rc; 1299 1351 ··· 1394 1292 break; 1395 1293 1396 1294 case HL_MEM_OP_FREE: 1397 - rc = free_device_memory(ctx, args->in.free.handle); 1295 + rc = 
free_device_memory(ctx, &args->in); 1398 1296 break; 1399 1297 1400 1298 case HL_MEM_OP_MAP: ··· 1403 1301 rc = 0; 1404 1302 } else { 1405 1303 rc = get_paddr_from_handle(ctx, &args->in, 1406 - &device_addr); 1304 + &device_addr); 1407 1305 } 1408 1306 1409 1307 memset(args, 0, sizeof(*args)); ··· 1412 1310 1413 1311 case HL_MEM_OP_UNMAP: 1414 1312 rc = 0; 1313 + break; 1314 + 1315 + case HL_MEM_OP_MAP_BLOCK: 1316 + rc = map_block(hdev, args->in.map_block.block_addr, 1317 + &block_handle); 1318 + args->out.handle = block_handle; 1415 1319 break; 1416 1320 1417 1321 default: ··· 1436 1328 union hl_mem_args *args = data; 1437 1329 struct hl_device *hdev = hpriv->hdev; 1438 1330 struct hl_ctx *ctx = hpriv->ctx; 1439 - u64 device_addr = 0; 1331 + u64 block_handle, device_addr = 0; 1440 1332 u32 handle = 0; 1441 1333 int rc; 1442 1334 ··· 1508 1400 goto out; 1509 1401 } 1510 1402 1511 - rc = free_device_memory(ctx, args->in.free.handle); 1403 + rc = free_device_memory(ctx, &args->in); 1512 1404 break; 1513 1405 1514 1406 case HL_MEM_OP_MAP: ··· 1519 1411 break; 1520 1412 1521 1413 case HL_MEM_OP_UNMAP: 1522 - rc = unmap_device_va(ctx, args->in.unmap.device_virt_addr, 1523 - false); 1414 + rc = unmap_device_va(ctx, &args->in, false); 1415 + break; 1416 + 1417 + case HL_MEM_OP_MAP_BLOCK: 1418 + rc = map_block(hdev, args->in.map_block.block_addr, 1419 + &block_handle); 1420 + args->out.handle = block_handle; 1524 1421 break; 1525 1422 1526 1423 default: ··· 1591 1478 return rc; 1592 1479 } 1593 1480 1594 - /* 1595 - * hl_pin_host_memory - pins a chunk of host memory. 1596 - * @hdev: pointer to the habanalabs device structure 1597 - * @addr: the host virtual address of the memory area 1598 - * @size: the size of the memory area 1599 - * @userptr: pointer to hl_userptr structure 1481 + /** 1482 + * hl_pin_host_memory() - pins a chunk of host memory. 1483 + * @hdev: pointer to the habanalabs device structure. 1484 + * @addr: the host virtual address of the memory area. 
1485 + * @size: the size of the memory area. 1486 + * @userptr: pointer to hl_userptr structure. 1600 1487 * 1601 1488 * This function does the following: 1602 - * - Pins the physical pages 1603 - * - Create an SG list from those pages 1489 + * - Pins the physical pages. 1490 + * - Create an SG list from those pages. 1604 1491 */ 1605 1492 int hl_pin_host_memory(struct hl_device *hdev, u64 addr, u64 size, 1606 1493 struct hl_userptr *userptr) ··· 1698 1585 kfree(userptr->sgt); 1699 1586 } 1700 1587 1701 - /* 1702 - * hl_userptr_delete_list - clear userptr list 1703 - * 1704 - * @hdev : pointer to the habanalabs device structure 1705 - * @userptr_list : pointer to the list to clear 1588 + /** 1589 + * hl_userptr_delete_list() - clear userptr list. 1590 + * @hdev: pointer to the habanalabs device structure. 1591 + * @userptr_list: pointer to the list to clear. 1706 1592 * 1707 1593 * This function does the following: 1708 1594 * - Iterates over the list and unpins the host memory and frees the userptr ··· 1720 1608 INIT_LIST_HEAD(userptr_list); 1721 1609 } 1722 1610 1723 - /* 1724 - * hl_userptr_is_pinned - returns whether the given userptr is pinned 1725 - * 1726 - * @hdev : pointer to the habanalabs device structure 1727 - * @userptr_list : pointer to the list to clear 1728 - * @userptr : pointer to userptr to check 1611 + /** 1612 + * hl_userptr_is_pinned() - returns whether the given userptr is pinned. 1613 + * @hdev: pointer to the habanalabs device structure. 1614 + * @userptr_list: pointer to the list to clear. 1615 + * @userptr: pointer to userptr to check. 
1729 1616 * 1730 1617 * This function does the following: 1731 1618 * - Iterates over the list and checks if the given userptr is in it, means is ··· 1742 1631 return false; 1743 1632 } 1744 1633 1745 - /* 1746 - * va_range_init - initialize virtual addresses range 1747 - * @hdev: pointer to the habanalabs device structure 1748 - * @va_range: pointer to the range to initialize 1749 - * @start: range start address 1750 - * @end: range end address 1634 + /** 1635 + * va_range_init() - initialize virtual addresses range. 1636 + * @hdev: pointer to the habanalabs device structure. 1637 + * @va_range: pointer to the range to initialize. 1638 + * @start: range start address. 1639 + * @end: range end address. 1751 1640 * 1752 1641 * This function does the following: 1753 1642 * - Initializes the virtual addresses list of the given range with the given ··· 1760 1649 1761 1650 INIT_LIST_HEAD(&va_range->list); 1762 1651 1763 - /* PAGE_SIZE alignment */ 1652 + /* 1653 + * PAGE_SIZE alignment 1654 + * it is the callers responsibility to align the addresses if the 1655 + * page size is not a power of 2 1656 + */ 1764 1657 1765 - if (start & (PAGE_SIZE - 1)) { 1766 - start &= PAGE_MASK; 1767 - start += PAGE_SIZE; 1658 + if (is_power_of_2(page_size)) { 1659 + if (start & (PAGE_SIZE - 1)) { 1660 + start &= PAGE_MASK; 1661 + start += PAGE_SIZE; 1662 + } 1663 + 1664 + if (end & (PAGE_SIZE - 1)) 1665 + end &= PAGE_MASK; 1768 1666 } 1769 - 1770 - if (end & (PAGE_SIZE - 1)) 1771 - end &= PAGE_MASK; 1772 1667 1773 1668 if (start >= end) { 1774 1669 dev_err(hdev->dev, "too small vm range for va list\n"); ··· 1795 1678 return 0; 1796 1679 } 1797 1680 1798 - /* 1799 - * va_range_fini() - clear a virtual addresses range 1800 - * @hdev: pointer to the habanalabs structure 1801 - * va_range: pointer to virtual addresses range 1681 + /** 1682 + * va_range_fini() - clear a virtual addresses range. 1683 + * @hdev: pointer to the habanalabs structure. 
1684 + * @va_range: pointer to virtual addresses range. 1802 1685 * 1803 1686 * This function does the following: 1804 - * - Frees the virtual addresses block list and its lock 1687 + * - Frees the virtual addresses block list and its lock. 1805 1688 */ 1806 1689 static void va_range_fini(struct hl_device *hdev, struct hl_va_range *va_range) 1807 1690 { ··· 1813 1696 kfree(va_range); 1814 1697 } 1815 1698 1816 - /* 1817 - * vm_ctx_init_with_ranges() - initialize virtual memory for context 1818 - * @ctx: pointer to the habanalabs context structure 1699 + /** 1700 + * vm_ctx_init_with_ranges() - initialize virtual memory for context. 1701 + * @ctx: pointer to the habanalabs context structure. 1819 1702 * @host_range_start: host virtual addresses range start. 1820 1703 * @host_range_end: host virtual addresses range end. 1821 1704 * @host_huge_range_start: host virtual addresses range start for memory 1822 - * allocated with huge pages. 1705 + * allocated with huge pages. 1823 1706 * @host_huge_range_end: host virtual addresses range end for memory allocated 1824 1707 * with huge pages. 1825 1708 * @dram_range_start: dram virtual addresses range start. 1826 1709 * @dram_range_end: dram virtual addresses range end. 1827 1710 * 1828 1711 * This function initializes the following: 1829 - * - MMU for context 1830 - * - Virtual address to area descriptor hashtable 1831 - * - Virtual block list of available virtual memory 1712 + * - MMU for context. 1713 + * - Virtual address to area descriptor hashtable. 1714 + * - Virtual block list of available virtual memory. 1832 1715 */ 1833 1716 static int vm_ctx_init_with_ranges(struct hl_ctx *ctx, 1834 1717 u64 host_range_start, ··· 1949 1832 1950 1833 dram_range_start = prop->dmmu.start_addr; 1951 1834 dram_range_end = prop->dmmu.end_addr; 1952 - dram_page_size = prop->dmmu.page_size; 1835 + dram_page_size = prop->dram_page_size ? 
1836 + prop->dram_page_size : prop->dmmu.page_size; 1953 1837 host_range_start = prop->pmmu.start_addr; 1954 1838 host_range_end = prop->pmmu.end_addr; 1955 1839 host_page_size = prop->pmmu.page_size; ··· 1964 1846 dram_range_start, dram_range_end, dram_page_size); 1965 1847 } 1966 1848 1967 - /* 1968 - * hl_vm_ctx_fini - virtual memory teardown of context 1969 - * 1970 - * @ctx : pointer to the habanalabs context structure 1849 + /** 1850 + * hl_vm_ctx_fini() - virtual memory teardown of context. 1851 + * @ctx: pointer to the habanalabs context structure. 1971 1852 * 1972 1853 * This function tears down the following: 1973 - * - Virtual block list of available virtual memory 1974 - * - Virtual address to area descriptor hashtable 1975 - * - MMU for context 1854 + * - Virtual block list of available virtual memory. 1855 + * - Virtual address to area descriptor hashtable. 1856 + * - MMU for context. 1976 1857 * 1977 1858 * In addition this function does the following: 1978 1859 * - Unmaps the existing hashtable nodes if the hashtable is not empty. 
The ··· 1990 1873 struct hl_vm_phys_pg_pack *phys_pg_list; 1991 1874 struct hl_vm_hash_node *hnode; 1992 1875 struct hlist_node *tmp_node; 1876 + struct hl_mem_in args; 1993 1877 int i; 1994 1878 1995 - if (!ctx->hdev->mmu_enable) 1879 + if (!hdev->mmu_enable) 1996 1880 return; 1997 1881 1998 1882 hl_debugfs_remove_ctx_mem_hash(hdev, ctx); ··· 2010 1892 dev_dbg(hdev->dev, 2011 1893 "hl_mem_hash_node of vaddr 0x%llx of asid %d is still alive\n", 2012 1894 hnode->vaddr, ctx->asid); 2013 - unmap_device_va(ctx, hnode->vaddr, true); 1895 + args.unmap.device_virt_addr = hnode->vaddr; 1896 + unmap_device_va(ctx, &args, true); 2014 1897 } 1898 + 1899 + mutex_lock(&ctx->mmu_lock); 2015 1900 2016 1901 /* invalidate the cache once after the unmapping loop */ 2017 1902 hdev->asic_funcs->mmu_invalidate_cache(hdev, true, VM_TYPE_USERPTR); 2018 1903 hdev->asic_funcs->mmu_invalidate_cache(hdev, true, VM_TYPE_PHYS_PACK); 1904 + 1905 + mutex_unlock(&ctx->mmu_lock); 2019 1906 2020 1907 spin_lock(&vm->idr_lock); 2021 1908 idr_for_each_entry(&vm->phys_pg_pack_handles, phys_pg_list, i) ··· 2048 1925 * because the user notifies us on allocations. If the user is no more, 2049 1926 * all DRAM is available 2050 1927 */ 2051 - if (!ctx->hdev->asic_prop.dram_supports_virtual_memory) 2052 - atomic64_set(&ctx->hdev->dram_used_mem, 0); 1928 + if (ctx->asid != HL_KERNEL_ASID_ID && 1929 + !hdev->asic_prop.dram_supports_virtual_memory) 1930 + atomic64_set(&hdev->dram_used_mem, 0); 2053 1931 } 2054 1932 2055 - /* 2056 - * hl_vm_init - initialize virtual memory module 2057 - * 2058 - * @hdev : pointer to the habanalabs device structure 1933 + /** 1934 + * hl_vm_init() - initialize virtual memory module. 1935 + * @hdev: pointer to the habanalabs device structure. 2059 1936 * 2060 1937 * This function initializes the following: 2061 - * - MMU module 2062 - * - DRAM physical pages pool of 2MB 2063 - * - Idr for device memory allocation handles 1938 + * - MMU module. 
1939 + * - DRAM physical pages pool of 2MB. 1940 + * - Idr for device memory allocation handles. 2064 1941 */ 2065 1942 int hl_vm_init(struct hl_device *hdev) 2066 1943 { ··· 2068 1945 struct hl_vm *vm = &hdev->vm; 2069 1946 int rc; 2070 1947 2071 - vm->dram_pg_pool = gen_pool_create(__ffs(prop->dram_page_size), -1); 1948 + if (is_power_of_2(prop->dram_page_size)) 1949 + vm->dram_pg_pool = 1950 + gen_pool_create(__ffs(prop->dram_page_size), -1); 1951 + else 1952 + vm->dram_pg_pool = 1953 + gen_pool_create(__ffs(DRAM_POOL_PAGE_SIZE), -1); 1954 + 2072 1955 if (!vm->dram_pg_pool) { 2073 1956 dev_err(hdev->dev, "Failed to create dram page pool\n"); 2074 1957 return -ENOMEM; ··· 2107 1978 return rc; 2108 1979 } 2109 1980 2110 - /* 2111 - * hl_vm_fini - virtual memory module teardown 2112 - * 2113 - * @hdev : pointer to the habanalabs device structure 1981 + /** 1982 + * hl_vm_fini() - virtual memory module teardown. 1983 + * @hdev: pointer to the habanalabs device structure. 2114 1984 * 2115 1985 * This function tears down the following: 2116 - * - Idr for device memory allocation handles 2117 - * - DRAM physical pages pool of 2MB 2118 - * - MMU module 1986 + * - Idr for device memory allocation handles. 1987 + * - DRAM physical pages pool of 2MB. 1988 + * - MMU module. 2119 1989 */ 2120 1990 void hl_vm_fini(struct hl_device *hdev) 2121 1991 {
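The non-power-of-2 DRAM page handling that unmap_device_va() gains above cannot use plain mask arithmetic, since `vaddr & ~(page_size - 1)` is only correct for power-of-2 sizes. A minimal userspace C sketch of the rounding, assuming an illustrative DRAM base and a 48 MB HBM page (the helper names and `DIV_ROUND_DOWN_ULL`, re-stated as truncating division, are stand-ins for the kernel versions):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel macro: truncating unsigned division */
#define DIV_ROUND_DOWN_ULL(x, d) ((x) / (d))

static int is_power_of_2_u64(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Round a device VA down to the start of its page. For non-power-of-2
 * DRAM page sizes (e.g. 48 MB) the page start must be computed relative
 * to the DRAM base with a division; masking would land on a power-of-2
 * boundary that is not a page start.
 */
static uint64_t page_align_down(uint64_t vaddr, uint64_t dram_base,
				uint64_t page_size)
{
	if (!is_power_of_2_u64(page_size))
		return dram_base +
		       DIV_ROUND_DOWN_ULL(vaddr - dram_base, page_size) *
		       page_size;

	return vaddr & ~(page_size - 1);
}
```

The same split shows up in hl_vm_init() above, where gen_pool_create() gets its allocation order from a fixed pool page size when the DRAM page size is not a power of two.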
+111 -13
drivers/misc/habanalabs/common/mmu.c drivers/misc/habanalabs/common/mmu/mmu.c
··· 7 7 8 8 #include <linux/slab.h> 9 9 10 - #include "habanalabs.h" 10 + #include "../habanalabs.h" 11 11 12 12 bool hl_is_dram_va(struct hl_device *hdev, u64 virt_addr) 13 13 { ··· 166 166 mmu_prop = &prop->pmmu; 167 167 168 168 pgt_residency = mmu_prop->host_resident ? MMU_HR_PGT : MMU_DR_PGT; 169 - 170 169 /* 171 170 * The H/W handles mapping of specific page sizes. Hence if the page 172 171 * size is bigger, we break it to sub-pages and unmap them separately. ··· 173 174 if ((page_size % mmu_prop->page_size) == 0) { 174 175 real_page_size = mmu_prop->page_size; 175 176 } else { 176 - dev_err(hdev->dev, 177 - "page size of %u is not %uKB aligned, can't unmap\n", 178 - page_size, mmu_prop->page_size >> 10); 177 + /* 178 + * MMU page size may differ from DRAM page size. 179 + * In such a case work with the DRAM page size and let the MMU 180 + * scrambling routine handle this mismatch when 181 + * calculating the address to remove from the MMU page table. 182 + */ 183 + if (is_dram_addr && ((page_size % prop->dram_page_size) == 0)) { 184 + real_page_size = prop->dram_page_size; 185 + } else { 186 + dev_err(hdev->dev, 187 + "page size of %u is not %uKB aligned, can't unmap\n", 188 + page_size, mmu_prop->page_size >> 10); 179 189 180 - return -EFAULT; 190 + return -EFAULT; 191 + } 181 192 } 182 193 183 194 npages = page_size / real_page_size; ··· 262 253 */ 263 254 if ((page_size % mmu_prop->page_size) == 0) { 264 255 real_page_size = mmu_prop->page_size; 256 + } else if (is_dram_addr && ((page_size % prop->dram_page_size) == 0) && 257 + (prop->dram_page_size < mmu_prop->page_size)) { 258 + /* 259 + * MMU page size may differ from DRAM page size. 260 + * In such a case work with the DRAM page size and let the MMU 261 + * scrambling routine handle this mismatch when calculating 262 + * the address to place in the MMU page table. 
(in that case 263 + * also make sure that the dram_page_size is smaller than the 264 + * mmu page size) 265 + */ 266 + real_page_size = prop->dram_page_size; 265 267 } else { 266 268 dev_err(hdev->dev, 267 269 "page size of %u is not %uKB aligned, can't map\n", ··· 281 261 return -EFAULT; 282 262 } 283 263 284 - WARN_ONCE((phys_addr & (real_page_size - 1)), 285 - "Mapping 0x%llx with page size of 0x%x is erroneous! Address must be divisible by page size", 286 - phys_addr, real_page_size); 264 + /* 265 + * Verify that the phys and virt addresses are aligned with the 266 + * MMU page size (in dram this means checking the address and MMU 267 + * after scrambling) 268 + */ 269 + if ((is_dram_addr && 270 + ((hdev->asic_funcs->scramble_addr(hdev, phys_addr) & 271 + (mmu_prop->page_size - 1)) || 272 + (hdev->asic_funcs->scramble_addr(hdev, virt_addr) & 273 + (mmu_prop->page_size - 1)))) || 274 + (!is_dram_addr && ((phys_addr & (real_page_size - 1)) || 275 + (virt_addr & (real_page_size - 1))))) 276 + dev_crit(hdev->dev, 277 + "Mapping address 0x%llx with virtual address 0x%llx and page size of 0x%x is erroneous! 
Addresses must be divisible by page size", 278 + phys_addr, virt_addr, real_page_size); 287 279 288 280 npages = page_size / real_page_size; 289 281 real_virt_addr = virt_addr; ··· 476 444 hdev->mmu_func[MMU_HR_PGT].swap_in(ctx); 477 445 } 478 446 447 + static void hl_mmu_pa_page_with_offset(struct hl_ctx *ctx, u64 virt_addr, 448 + struct hl_mmu_hop_info *hops, 449 + u64 *phys_addr) 450 + { 451 + struct hl_device *hdev = ctx->hdev; 452 + struct asic_fixed_properties *prop = &hdev->asic_prop; 453 + u64 offset_mask, addr_mask, hop_shift, tmp_phys_addr; 454 + u32 hop0_shift_off; 455 + void *p; 456 + 457 + /* last hop holds the phys address and flags */ 458 + if (hops->unscrambled_paddr) 459 + tmp_phys_addr = hops->unscrambled_paddr; 460 + else 461 + tmp_phys_addr = hops->hop_info[hops->used_hops - 1].hop_pte_val; 462 + 463 + if (hops->range_type == HL_VA_RANGE_TYPE_HOST_HUGE) 464 + p = &prop->pmmu_huge; 465 + else if (hops->range_type == HL_VA_RANGE_TYPE_HOST) 466 + p = &prop->pmmu; 467 + else /* HL_VA_RANGE_TYPE_DRAM */ 468 + p = &prop->dmmu; 469 + 470 + /* 471 + * find the correct hop shift field in hl_mmu_properties structure 472 + * in order to determine the right mask for the page offset. 
473 + */ 474 + hop0_shift_off = offsetof(struct hl_mmu_properties, hop0_shift); 475 + p = (char *)p + hop0_shift_off; 476 + p = (char *)p + ((hops->used_hops - 1) * sizeof(u64)); 477 + hop_shift = *(u64 *)p; 478 + offset_mask = (1 << hop_shift) - 1; 479 + addr_mask = ~(offset_mask); 480 + *phys_addr = (tmp_phys_addr & addr_mask) | 481 + (virt_addr & offset_mask); 482 + } 483 + 479 484 int hl_mmu_va_to_pa(struct hl_ctx *ctx, u64 virt_addr, u64 *phys_addr) 480 485 { 481 486 struct hl_mmu_hop_info hops; 482 - u64 tmp_addr; 483 487 int rc; 484 488 485 489 rc = hl_mmu_get_tlb_info(ctx, virt_addr, &hops); 486 490 if (rc) 487 491 return rc; 488 492 489 - /* last hop holds the phys address and flags */ 490 - tmp_addr = hops.hop_info[hops.used_hops - 1].hop_pte_val; 491 - *phys_addr = (tmp_addr & HOP_PHYS_ADDR_MASK) | (virt_addr & FLAGS_MASK); 493 + hl_mmu_pa_page_with_offset(ctx, virt_addr, &hops, phys_addr); 492 494 493 495 return 0; 494 496 } ··· 538 472 539 473 if (!hdev->mmu_enable) 540 474 return -EOPNOTSUPP; 475 + 476 + hops->scrambled_vaddr = virt_addr; /* assume no scrambling */ 541 477 542 478 is_dram_addr = hl_mem_area_inside_range(virt_addr, prop->dmmu.page_size, 543 479 prop->dmmu.start_addr, ··· 558 490 virt_addr, hops); 559 491 560 492 mutex_unlock(&ctx->mmu_lock); 493 + 494 + /* add page offset to physical address */ 495 + if (hops->unscrambled_paddr) 496 + hl_mmu_pa_page_with_offset(ctx, virt_addr, hops, 497 + &hops->unscrambled_paddr); 561 498 562 499 return rc; 563 500 } ··· 584 511 } 585 512 586 513 return 0; 514 + } 515 + 516 + /** 517 + * hl_mmu_scramble_addr() - The generic mmu address scrambling routine. 518 + * @hdev: pointer to device data. 519 + * @addr: The address to scramble. 520 + * 521 + * Return: The scrambled address. 522 + */ 523 + u64 hl_mmu_scramble_addr(struct hl_device *hdev, u64 addr) 524 + { 525 + return addr; 526 + } 527 + 528 + /** 529 + * hl_mmu_descramble_addr() - The generic mmu address descrambling 530 + * routine. 
531 + * @hdev: pointer to device data. 532 + * @addr: The address to descramble. 533 + * 534 + * Return: The un-scrambled address. 535 + */ 536 + u64 hl_mmu_descramble_addr(struct hl_device *hdev, u64 addr) 537 + { 538 + return addr; 587 539 }
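Once hl_mmu_pa_page_with_offset() above has located the hop shift, producing the physical address is a straight merge of the PTE's page frame with the VA's in-page offset. A self-contained sketch of that final step (the function name and the test values are illustrative, not driver API):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Merge the page frame taken from the last hop PTE with the in-page
 * offset taken from the virtual address. hop_shift is the number of
 * page-offset bits for the hop that mapped the page (12 for 4 KB
 * pages, 21 for 2 MB pages, and so on).
 */
static uint64_t pa_with_offset(uint64_t pte_val, uint64_t virt_addr,
			       unsigned int hop_shift)
{
	uint64_t offset_mask = (1ULL << hop_shift) - 1;

	return (pte_val & ~offset_mask) | (virt_addr & offset_mask);
}
```

Using 1ULL for the shift keeps the mask correct for hop shifts of 32 and above, which matters once huge DRAM pages are in play.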
+2
drivers/misc/habanalabs/common/mmu/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0-only 2 + HL_COMMON_MMU_FILES := common/mmu/mmu.o common/mmu/mmu_v1.o
+2 -2
drivers/misc/habanalabs/common/mmu_v1.c drivers/misc/habanalabs/common/mmu/mmu_v1.c
··· 5 5 * All Rights Reserved. 6 6 */ 7 7 8 - #include "habanalabs.h" 9 - #include "../include/hw_ip/mmu/mmu_general.h" 8 + #include "../habanalabs.h" 9 + #include "../../include/hw_ip/mmu/mmu_general.h" 10 10 11 11 #include <linux/slab.h> 12 12
+9 -38
drivers/misc/habanalabs/common/pci.c drivers/misc/habanalabs/common/pci/pci.c
··· 5 5 * All Rights Reserved. 6 6 */ 7 7 8 - #include "habanalabs.h" 9 - #include "../include/hw_ip/pci/pci_general.h" 8 + #include "../habanalabs.h" 9 + #include "../../include/hw_ip/pci/pci_general.h" 10 10 11 11 #include <linux/pci.h> 12 12 ··· 308 308 } 309 309 310 310 /** 311 - * hl_pci_set_dma_mask() - Set DMA masks for the device. 312 - * @hdev: Pointer to hl_device structure. 313 - * 314 - * This function sets the DMA masks (regular and consistent) for a specified 315 - * value. If it doesn't succeed, it tries to set it to a fall-back value 316 - * 317 - * Return: 0 on success, non-zero for failure. 318 - */ 319 - static int hl_pci_set_dma_mask(struct hl_device *hdev) 320 - { 321 - struct pci_dev *pdev = hdev->pdev; 322 - int rc; 323 - 324 - /* set DMA mask */ 325 - rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(hdev->dma_mask)); 326 - if (rc) { 327 - dev_err(hdev->dev, 328 - "Failed to set pci dma mask to %d bits, error %d\n", 329 - hdev->dma_mask, rc); 330 - return rc; 331 - } 332 - 333 - rc = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(hdev->dma_mask)); 334 - if (rc) { 335 - dev_err(hdev->dev, 336 - "Failed to set pci consistent dma mask to %d bits, error %d\n", 337 - hdev->dma_mask, rc); 338 - return rc; 339 - } 340 - 341 - return 0; 342 - } 343 - 344 - /** 345 311 * hl_pci_init() - PCI initialization code. 346 312 * @hdev: Pointer to hl_device structure. 347 313 * ··· 343 377 goto unmap_pci_bars; 344 378 } 345 379 346 - rc = hl_pci_set_dma_mask(hdev); 347 - if (rc) 380 + rc = dma_set_mask_and_coherent(&pdev->dev, 381 + DMA_BIT_MASK(hdev->dma_mask)); 382 + if (rc) { 383 + dev_err(hdev->dev, 384 + "Failed to set dma mask to %d bits, error %d\n", 385 + hdev->dma_mask, rc); 348 386 goto unmap_pci_bars; 387 + } 349 388 350 389 return 0; 351 390
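The removed hl_pci_set_dma_mask() helper issued two calls, one for the streaming mask and one for the coherent mask; dma_set_mask_and_coherent() sets both at once. The value it is handed is DMA_BIT_MASK(hdev->dma_mask), re-stated here in userspace C to show the resulting masks (a sketch mirroring the kernel macro, not the kernel header itself):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace restatement of the kernel's DMA_BIT_MASK(). The n == 64
 * case must be special-cased because 1ULL << 64 is undefined in C.
 */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

static uint64_t dma_mask_for_bits(unsigned int bits)
{
	return DMA_BIT_MASK(bits);
}
```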
+2
drivers/misc/habanalabs/common/pci/Makefile
··· 1 + # SPDX-License-Identifier: GPL-2.0-only 2 + HL_COMMON_PCI_FILES := common/pci/pci.o
+338 -98
drivers/misc/habanalabs/gaudi/gaudi.c
··· 225 225 "MSG AXI LBW returned with error" 226 226 }; 227 227 228 + enum gaudi_sm_sei_cause { 229 + GAUDI_SM_SEI_SO_OVERFLOW, 230 + GAUDI_SM_SEI_LBW_4B_UNALIGNED, 231 + GAUDI_SM_SEI_AXI_RESPONSE_ERR 232 + }; 233 + 228 234 static enum hl_queue_type gaudi_queue_type[GAUDI_QUEUE_ID_SIZE] = { 229 235 QUEUE_TYPE_EXT, /* GAUDI_QUEUE_ID_DMA_0_0 */ 230 236 QUEUE_TYPE_EXT, /* GAUDI_QUEUE_ID_DMA_0_1 */ ··· 360 354 struct hl_cs_job *job); 361 355 static int gaudi_memset_device_memory(struct hl_device *hdev, u64 addr, 362 356 u32 size, u64 val); 357 + static int gaudi_memset_registers(struct hl_device *hdev, u64 reg_base, 358 + u32 num_regs, u32 val); 359 + static int gaudi_schedule_register_memset(struct hl_device *hdev, 360 + u32 hw_queue_id, u64 reg_base, u32 num_regs, u32 val); 363 361 static int gaudi_run_tpc_kernel(struct hl_device *hdev, u64 tpc_kernel, 364 362 u32 tpc_id); 365 363 static int gaudi_mmu_clear_pgt_range(struct hl_device *hdev); ··· 526 516 prop->first_available_user_mon[HL_GAUDI_WS_DCORE] = 527 517 prop->sync_stream_first_mon + 528 518 (num_sync_stream_queues * HL_RSVD_MONS); 519 + 520 + prop->first_available_user_msix_interrupt = USHRT_MAX; 529 521 530 522 /* disable fw security for now, set it in a later stage */ 531 523 prop->fw_security_disabled = true; ··· 925 913 struct gaudi_hw_sob_group *hw_sob_group = 926 914 container_of(ref, struct gaudi_hw_sob_group, kref); 927 915 struct hl_device *hdev = hw_sob_group->hdev; 928 - int i; 916 + u64 base_addr; 917 + int rc; 929 918 930 - for (i = 0 ; i < NUMBER_OF_SOBS_IN_GRP ; i++) 931 - WREG32(mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + 932 - (hw_sob_group->base_sob_id + i) * 4, 0); 919 + base_addr = CFG_BASE + mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + 920 + hw_sob_group->base_sob_id * 4; 921 + rc = gaudi_schedule_register_memset(hdev, hw_sob_group->queue_id, 922 + base_addr, NUMBER_OF_SOBS_IN_GRP, 0); 923 + if (rc) 924 + dev_err(hdev->dev, 925 + "failed resetting sob group - sob base %u, count %u", 926 + 
hw_sob_group->base_sob_id, NUMBER_OF_SOBS_IN_GRP); 933 927 934 928 kref_init(&hw_sob_group->kref); 935 929 } ··· 1025 1007 master_sob_base = 1026 1008 cprop->hw_sob_group[sob_group_offset].base_sob_id; 1027 1009 master_monitor = prop->collective_mstr_mon_id[0]; 1010 + 1011 + cprop->hw_sob_group[sob_group_offset].queue_id = queue_id; 1028 1012 1029 1013 dev_dbg(hdev->dev, 1030 1014 "Generate master wait CBs, sob %d (mask %#x), val:0x%x, mon %u, q %d\n", ··· 1268 1248 u32 queue_id, collective_queue, num_jobs; 1269 1249 u32 stream, nic_queue, nic_idx = 0; 1270 1250 bool skip; 1271 - int i, rc; 1251 + int i, rc = 0; 1272 1252 1273 1253 /* Verify wait queue id is configured as master */ 1274 1254 hw_queue_prop = &hdev->asic_prop.hw_queues_props[wait_queue_id]; ··· 1627 1607 1628 1608 hdev->supports_sync_stream = true; 1629 1609 hdev->supports_coresight = true; 1610 + hdev->supports_staged_submission = true; 1630 1611 1631 1612 return 0; 1632 1613 ··· 4539 4518 { 4540 4519 struct asic_fixed_properties *prop = &hdev->asic_prop; 4541 4520 struct gaudi_device *gaudi = hdev->asic_specific; 4542 - u64 idle_mask = 0; 4543 4521 int rc = 0; 4544 4522 u64 val = 0; 4545 4523 ··· 4551 4531 hdev, 4552 4532 mmDMA0_CORE_STS0/* dummy */, 4553 4533 val/* dummy */, 4554 - (hdev->asic_funcs->is_device_idle(hdev, 4555 - &idle_mask, NULL)), 4534 + (hdev->asic_funcs->is_device_idle(hdev, NULL, 4535 + 0, NULL)), 4556 4536 1000, 4557 4537 HBM_SCRUBBING_TIMEOUT_US); 4558 4538 if (rc) { ··· 5080 5060 * 1. A packet that will act as a completion packet 5081 5061 * 2. A packet that will generate MSI-X interrupt 5082 5062 */ 5083 - parser->patched_cb_size += sizeof(struct packet_msg_prot) * 2; 5063 + if (parser->completion) 5064 + parser->patched_cb_size += sizeof(struct packet_msg_prot) * 2; 5084 5065 5085 5066 return rc; 5086 5067 } ··· 5308 5287 * 1. A packet that will act as a completion packet 5309 5288 * 2. 
A packet that will generate MSI interrupt 5310 5289 */ 5311 - parser->patched_cb_size = parser->user_cb_size + 5312 - sizeof(struct packet_msg_prot) * 2; 5290 + if (parser->completion) 5291 + parser->patched_cb_size = parser->user_cb_size + 5292 + sizeof(struct packet_msg_prot) * 2; 5293 + else 5294 + parser->patched_cb_size = parser->user_cb_size; 5313 5295 5314 5296 rc = hl_cb_create(hdev, &hdev->kernel_cb_mgr, hdev->kernel_ctx, 5315 5297 parser->patched_cb_size, false, false, ··· 5328 5304 patched_cb_handle >>= PAGE_SHIFT; 5329 5305 parser->patched_cb = hl_cb_get(hdev, &hdev->kernel_cb_mgr, 5330 5306 (u32) patched_cb_handle); 5331 - /* hl_cb_get should never fail here so use kernel WARN */ 5332 - WARN(!parser->patched_cb, "DMA CB handle invalid 0x%x\n", 5333 - (u32) patched_cb_handle); 5307 + /* hl_cb_get should never fail */ 5334 5308 if (!parser->patched_cb) { 5309 + dev_crit(hdev->dev, "DMA CB handle invalid 0x%x\n", 5310 + (u32) patched_cb_handle); 5335 5311 rc = -EFAULT; 5336 5312 goto out; 5337 5313 } ··· 5400 5376 patched_cb_handle >>= PAGE_SHIFT; 5401 5377 parser->patched_cb = hl_cb_get(hdev, &hdev->kernel_cb_mgr, 5402 5378 (u32) patched_cb_handle); 5403 - /* hl_cb_get should never fail here so use kernel WARN */ 5404 - WARN(!parser->patched_cb, "DMA CB handle invalid 0x%x\n", 5405 - (u32) patched_cb_handle); 5379 + /* hl_cb_get should never fail here */ 5406 5380 if (!parser->patched_cb) { 5381 + dev_crit(hdev->dev, "DMA CB handle invalid 0x%x\n", 5382 + (u32) patched_cb_handle); 5407 5383 rc = -EFAULT; 5408 5384 goto out; 5409 5385 } ··· 5603 5579 return rc; 5604 5580 } 5605 5581 5606 - static void gaudi_restore_sm_registers(struct hl_device *hdev) 5582 + static int gaudi_memset_registers(struct hl_device *hdev, u64 reg_base, 5583 + u32 num_regs, u32 val) 5607 5584 { 5585 + struct packet_msg_long *pkt; 5586 + struct hl_cs_job *job; 5587 + u32 cb_size, ctl; 5588 + struct hl_cb *cb; 5589 + int i, rc; 5590 + 5591 + cb_size = (sizeof(*pkt) * num_regs) + 
sizeof(struct packet_msg_prot); 5592 + 5593 + if (cb_size > SZ_2M) { 5594 + dev_err(hdev->dev, "CB size must be smaller than %uMB", SZ_2M); 5595 + return -ENOMEM; 5596 + } 5597 + 5598 + cb = hl_cb_kernel_create(hdev, cb_size, false); 5599 + if (!cb) 5600 + return -EFAULT; 5601 + 5602 + pkt = cb->kernel_address; 5603 + 5604 + ctl = FIELD_PREP(GAUDI_PKT_LONG_CTL_OP_MASK, 0); /* write the value */ 5605 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_OPCODE_MASK, PACKET_MSG_LONG); 5606 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_EB_MASK, 1); 5607 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_RB_MASK, 1); 5608 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_MB_MASK, 1); 5609 + 5610 + for (i = 0; i < num_regs ; i++, pkt++) { 5611 + pkt->ctl = cpu_to_le32(ctl); 5612 + pkt->value = cpu_to_le32(val); 5613 + pkt->addr = cpu_to_le64(reg_base + (i * 4)); 5614 + } 5615 + 5616 + job = hl_cs_allocate_job(hdev, QUEUE_TYPE_EXT, true); 5617 + if (!job) { 5618 + dev_err(hdev->dev, "Failed to allocate a new job\n"); 5619 + rc = -ENOMEM; 5620 + goto release_cb; 5621 + } 5622 + 5623 + job->id = 0; 5624 + job->user_cb = cb; 5625 + atomic_inc(&job->user_cb->cs_cnt); 5626 + job->user_cb_size = cb_size; 5627 + job->hw_queue_id = GAUDI_QUEUE_ID_DMA_0_0; 5628 + job->patched_cb = job->user_cb; 5629 + job->job_cb_size = cb_size; 5630 + 5631 + hl_debugfs_add_job(hdev, job); 5632 + 5633 + rc = gaudi_send_job_on_qman0(hdev, job); 5634 + hl_debugfs_remove_job(hdev, job); 5635 + kfree(job); 5636 + atomic_dec(&cb->cs_cnt); 5637 + 5638 + release_cb: 5639 + hl_cb_put(cb); 5640 + hl_cb_destroy(hdev, &hdev->kernel_cb_mgr, cb->id << PAGE_SHIFT); 5641 + 5642 + return rc; 5643 + } 5644 + 5645 + static int gaudi_schedule_register_memset(struct hl_device *hdev, 5646 + u32 hw_queue_id, u64 reg_base, u32 num_regs, u32 val) 5647 + { 5648 + struct hl_ctx *ctx = hdev->compute_ctx; 5649 + struct hl_pending_cb *pending_cb; 5650 + struct packet_msg_long *pkt; 5651 + u32 cb_size, ctl; 5652 + struct hl_cb *cb; 5608 5653 int i; 5609 5654 5610 - for (i = 0 ; i < 
NUM_OF_SOB_IN_BLOCK << 2 ; i += 4) { 5611 - WREG32(mmSYNC_MNGR_E_N_SYNC_MNGR_OBJS_SOB_OBJ_0 + i, 0); 5612 - WREG32(mmSYNC_MNGR_E_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + i, 0); 5613 - WREG32(mmSYNC_MNGR_W_N_SYNC_MNGR_OBJS_SOB_OBJ_0 + i, 0); 5655 + /* If no compute context available or context is going down 5656 + * memset registers directly 5657 + */ 5658 + if (!ctx || kref_read(&ctx->refcount) == 0) 5659 + return gaudi_memset_registers(hdev, reg_base, num_regs, val); 5660 + 5661 + cb_size = (sizeof(*pkt) * num_regs) + 5662 + sizeof(struct packet_msg_prot) * 2; 5663 + 5664 + if (cb_size > SZ_2M) { 5665 + dev_err(hdev->dev, "CB size must be smaller than %uMB", SZ_2M); 5666 + return -ENOMEM; 5614 5667 } 5615 5668 5616 - for (i = 0 ; i < NUM_OF_MONITORS_IN_BLOCK << 2 ; i += 4) { 5617 - WREG32(mmSYNC_MNGR_E_N_SYNC_MNGR_OBJS_MON_STATUS_0 + i, 0); 5618 - WREG32(mmSYNC_MNGR_E_S_SYNC_MNGR_OBJS_MON_STATUS_0 + i, 0); 5619 - WREG32(mmSYNC_MNGR_W_N_SYNC_MNGR_OBJS_MON_STATUS_0 + i, 0); 5669 + pending_cb = kzalloc(sizeof(*pending_cb), GFP_KERNEL); 5670 + if (!pending_cb) 5671 + return -ENOMEM; 5672 + 5673 + cb = hl_cb_kernel_create(hdev, cb_size, false); 5674 + if (!cb) { 5675 + kfree(pending_cb); 5676 + return -EFAULT; 5620 5677 } 5621 5678 5622 - i = GAUDI_FIRST_AVAILABLE_W_S_SYNC_OBJECT * 4; 5679 + pkt = cb->kernel_address; 5623 5680 5624 - for (; i < NUM_OF_SOB_IN_BLOCK << 2 ; i += 4) 5625 - WREG32(mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + i, 0); 5681 + ctl = FIELD_PREP(GAUDI_PKT_LONG_CTL_OP_MASK, 0); /* write the value */ 5682 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_OPCODE_MASK, PACKET_MSG_LONG); 5683 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_EB_MASK, 1); 5684 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_RB_MASK, 1); 5685 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_MB_MASK, 1); 5626 5686 5627 - i = GAUDI_FIRST_AVAILABLE_W_S_MONITOR * 4; 5687 + for (i = 0; i < num_regs ; i++, pkt++) { 5688 + pkt->ctl = cpu_to_le32(ctl); 5689 + pkt->value = cpu_to_le32(val); 5690 + pkt->addr = cpu_to_le64(reg_base + (i * 4)); 5691 + } 
5628 5692 5629 - for (; i < NUM_OF_MONITORS_IN_BLOCK << 2 ; i += 4) 5630 - WREG32(mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_MON_STATUS_0 + i, 0); 5693 + hl_cb_destroy(hdev, &hdev->kernel_cb_mgr, cb->id << PAGE_SHIFT); 5694 + 5695 + pending_cb->cb = cb; 5696 + pending_cb->cb_size = cb_size; 5697 + /* The queue ID MUST be an external queue ID. Otherwise, we will 5698 + * have undefined behavior 5699 + */ 5700 + pending_cb->hw_queue_id = hw_queue_id; 5701 + 5702 + spin_lock(&ctx->pending_cb_lock); 5703 + list_add_tail(&pending_cb->cb_node, &ctx->pending_cb_list); 5704 + spin_unlock(&ctx->pending_cb_lock); 5705 + 5706 + return 0; 5707 + } 5708 + 5709 + static int gaudi_restore_sm_registers(struct hl_device *hdev) 5710 + { 5711 + u64 base_addr; 5712 + u32 num_regs; 5713 + int rc; 5714 + 5715 + base_addr = CFG_BASE + mmSYNC_MNGR_E_N_SYNC_MNGR_OBJS_SOB_OBJ_0; 5716 + num_regs = NUM_OF_SOB_IN_BLOCK; 5717 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5718 + if (rc) { 5719 + dev_err(hdev->dev, "failed resetting SM registers"); 5720 + return -ENOMEM; 5721 + } 5722 + 5723 + base_addr = CFG_BASE + mmSYNC_MNGR_E_S_SYNC_MNGR_OBJS_SOB_OBJ_0; 5724 + num_regs = NUM_OF_SOB_IN_BLOCK; 5725 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5726 + if (rc) { 5727 + dev_err(hdev->dev, "failed resetting SM registers"); 5728 + return -ENOMEM; 5729 + } 5730 + 5731 + base_addr = CFG_BASE + mmSYNC_MNGR_W_N_SYNC_MNGR_OBJS_SOB_OBJ_0; 5732 + num_regs = NUM_OF_SOB_IN_BLOCK; 5733 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5734 + if (rc) { 5735 + dev_err(hdev->dev, "failed resetting SM registers"); 5736 + return -ENOMEM; 5737 + } 5738 + 5739 + base_addr = CFG_BASE + mmSYNC_MNGR_E_N_SYNC_MNGR_OBJS_MON_STATUS_0; 5740 + num_regs = NUM_OF_MONITORS_IN_BLOCK; 5741 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5742 + if (rc) { 5743 + dev_err(hdev->dev, "failed resetting SM registers"); 5744 + return -ENOMEM; 5745 + } 5746 + 5747 + base_addr = CFG_BASE + 
mmSYNC_MNGR_E_S_SYNC_MNGR_OBJS_MON_STATUS_0; 5748 + num_regs = NUM_OF_MONITORS_IN_BLOCK; 5749 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5750 + if (rc) { 5751 + dev_err(hdev->dev, "failed resetting SM registers"); 5752 + return -ENOMEM; 5753 + } 5754 + 5755 + base_addr = CFG_BASE + mmSYNC_MNGR_W_N_SYNC_MNGR_OBJS_MON_STATUS_0; 5756 + num_regs = NUM_OF_MONITORS_IN_BLOCK; 5757 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5758 + if (rc) { 5759 + dev_err(hdev->dev, "failed resetting SM registers"); 5760 + return -ENOMEM; 5761 + } 5762 + 5763 + base_addr = CFG_BASE + mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + 5764 + (GAUDI_FIRST_AVAILABLE_W_S_SYNC_OBJECT * 4); 5765 + num_regs = NUM_OF_SOB_IN_BLOCK - GAUDI_FIRST_AVAILABLE_W_S_SYNC_OBJECT; 5766 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5767 + if (rc) { 5768 + dev_err(hdev->dev, "failed resetting SM registers"); 5769 + return -ENOMEM; 5770 + } 5771 + 5772 + base_addr = CFG_BASE + mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_MON_STATUS_0 + 5773 + (GAUDI_FIRST_AVAILABLE_W_S_MONITOR * 4); 5774 + num_regs = NUM_OF_MONITORS_IN_BLOCK - GAUDI_FIRST_AVAILABLE_W_S_MONITOR; 5775 + rc = gaudi_memset_registers(hdev, base_addr, num_regs, 0); 5776 + if (rc) { 5777 + dev_err(hdev->dev, "failed resetting SM registers"); 5778 + return -ENOMEM; 5779 + } 5780 + 5781 + return 0; 5631 5782 } 5632 5783 5633 5784 static void gaudi_restore_dma_registers(struct hl_device *hdev) ··· 5859 5660 } 5860 5661 } 5861 5662 5862 - static void gaudi_restore_user_registers(struct hl_device *hdev) 5663 + static int gaudi_restore_user_registers(struct hl_device *hdev) 5863 5664 { 5864 - gaudi_restore_sm_registers(hdev); 5665 + int rc; 5666 + 5667 + rc = gaudi_restore_sm_registers(hdev); 5668 + if (rc) 5669 + return rc; 5670 + 5865 5671 gaudi_restore_dma_registers(hdev); 5866 5672 gaudi_restore_qm_registers(hdev); 5673 + 5674 + return 0; 5867 5675 } 5868 5676 5869 5677 static int gaudi_context_switch(struct hl_device *hdev, 
u32 asid) 5870 5678 { 5871 - gaudi_restore_user_registers(hdev); 5872 - 5873 - return 0; 5679 + return gaudi_restore_user_registers(hdev); 5874 5680 } 5875 5681 5876 5682 static int gaudi_mmu_clear_pgt_range(struct hl_device *hdev) ··· 5934 5730 } 5935 5731 if (hbm_bar_addr == U64_MAX) 5936 5732 rc = -EIO; 5937 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 5938 - *val = *(u32 *) phys_to_virt(addr - HOST_PHYS_BASE); 5939 5733 } else { 5940 5734 rc = -EFAULT; 5941 5735 } ··· 5979 5777 } 5980 5778 if (hbm_bar_addr == U64_MAX) 5981 5779 rc = -EIO; 5982 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 5983 - *(u32 *) phys_to_virt(addr - HOST_PHYS_BASE) = val; 5984 5780 } else { 5985 5781 rc = -EFAULT; 5986 5782 } ··· 6028 5828 } 6029 5829 if (hbm_bar_addr == U64_MAX) 6030 5830 rc = -EIO; 6031 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 6032 - *val = *(u64 *) phys_to_virt(addr - HOST_PHYS_BASE); 6033 5831 } else { 6034 5832 rc = -EFAULT; 6035 5833 } ··· 6076 5878 } 6077 5879 if (hbm_bar_addr == U64_MAX) 6078 5880 rc = -EIO; 6079 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 6080 - *(u64 *) phys_to_virt(addr - HOST_PHYS_BASE) = val; 6081 5881 } else { 6082 5882 rc = -EFAULT; 6083 5883 } ··· 6120 5924 return; 6121 5925 6122 5926 if (asid & ~DMA0_QM_GLBL_NON_SECURE_PROPS_0_ASID_MASK) { 6123 - WARN(1, "asid %u is too big\n", asid); 5927 + dev_crit(hdev->dev, "asid %u is too big\n", asid); 6124 5928 return; 6125 5929 } 6126 5930 ··· 6423 6227 else 6424 6228 timeout = HL_DEVICE_TIMEOUT_USEC; 6425 6229 6426 - if (!hdev->asic_funcs->is_device_idle(hdev, NULL, NULL)) { 6230 + if (!hdev->asic_funcs->is_device_idle(hdev, NULL, 0, NULL)) { 6427 6231 dev_err_ratelimited(hdev->dev, 6428 6232 "Can't send driver job on QMAN0 because the device is not idle\n"); 6429 6233 return -EBUSY; ··· 6851 6655 qm_name, 6852 6656 gaudi_qman_arb_error_cause[j]); 6853 6657 } 6658 + } 6659 + } 6660 + 
6661 + static void gaudi_print_sm_sei_info(struct hl_device *hdev, u16 event_type, 6662 + struct hl_eq_sm_sei_data *sei_data) 6663 + { 6664 + u32 index = event_type - GAUDI_EVENT_DMA_IF_SEI_0; 6665 + 6666 + switch (sei_data->sei_cause) { 6667 + case SM_SEI_SO_OVERFLOW: 6668 + dev_err(hdev->dev, 6669 + "SM %u SEI Error: SO %u overflow/underflow", 6670 + index, le32_to_cpu(sei_data->sei_log)); 6671 + break; 6672 + case SM_SEI_LBW_4B_UNALIGNED: 6673 + dev_err(hdev->dev, 6674 + "SM %u SEI Error: Unaligned 4B LBW access, monitor agent address low - %#x", 6675 + index, le32_to_cpu(sei_data->sei_log)); 6676 + break; 6677 + case SM_SEI_AXI_RESPONSE_ERR: 6678 + dev_err(hdev->dev, 6679 + "SM %u SEI Error: AXI ID %u response error", 6680 + index, le32_to_cpu(sei_data->sei_log)); 6681 + break; 6682 + default: 6683 + dev_err(hdev->dev, "Unknown SM SEI cause %u", 6684 + le32_to_cpu(sei_data->sei_log)); 6685 + break; 6854 6686 } 6855 6687 } 6856 6688 ··· 7377 7153 gaudi_hbm_read_interrupts(hdev, 7378 7154 gaudi_hbm_event_to_dev(event_type), 7379 7155 &eq_entry->hbm_ecc_data); 7156 + hl_fw_unmask_irq(hdev, event_type); 7380 7157 break; 7381 7158 7382 7159 case GAUDI_EVENT_TPC0_DEC: ··· 7506 7281 hl_fw_unmask_irq(hdev, event_type); 7507 7282 break; 7508 7283 7284 + case GAUDI_EVENT_DMA_IF_SEI_0 ... GAUDI_EVENT_DMA_IF_SEI_3: 7285 + gaudi_print_irq_info(hdev, event_type, false); 7286 + gaudi_print_sm_sei_info(hdev, event_type, 7287 + &eq_entry->sm_sei_data); 7288 + hl_fw_unmask_irq(hdev, event_type); 7289 + break; 7290 + 7509 7291 case GAUDI_EVENT_FIX_POWER_ENV_S ... 
GAUDI_EVENT_FIX_THERMAL_ENV_E: 7510 7292 gaudi_print_clk_change_info(hdev, event_type); 7511 7293 hl_fw_unmask_irq(hdev, event_type); ··· 7562 7330 else 7563 7331 timeout_usec = MMU_CONFIG_TIMEOUT_USEC; 7564 7332 7565 - mutex_lock(&hdev->mmu_cache_lock); 7566 - 7567 7333 /* L0 & L1 invalidation */ 7568 7334 WREG32(mmSTLB_INV_PS, 3); 7569 7335 WREG32(mmSTLB_CACHE_INV, gaudi->mmu_cache_inv_pi++); ··· 7576 7346 timeout_usec); 7577 7347 7578 7348 WREG32(mmSTLB_INV_SET, 0); 7579 - 7580 - mutex_unlock(&hdev->mmu_cache_lock); 7581 7349 7582 7350 if (rc) { 7583 7351 dev_err_ratelimited(hdev->dev, ··· 7598 7370 if (!(gaudi->hw_cap_initialized & HW_CAP_MMU) || 7599 7371 hdev->hard_reset_pending) 7600 7372 return 0; 7601 - 7602 - mutex_lock(&hdev->mmu_cache_lock); 7603 7373 7604 7374 if (hdev->pldm) 7605 7375 timeout_usec = GAUDI_PLDM_MMU_TIMEOUT_USEC; ··· 7625 7399 status == pi, 7626 7400 1000, 7627 7401 timeout_usec); 7628 - 7629 - mutex_unlock(&hdev->mmu_cache_lock); 7630 7402 7631 7403 if (rc) { 7632 7404 dev_err_ratelimited(hdev->dev, ··· 7687 7463 if (!(gaudi->hw_cap_initialized & HW_CAP_CPU_Q)) 7688 7464 return 0; 7689 7465 7690 - rc = hl_fw_cpucp_info_get(hdev, mmCPU_BOOT_DEV_STS0); 7466 + rc = hl_fw_cpucp_info_get(hdev, mmCPU_BOOT_DEV_STS0, mmCPU_BOOT_ERR0); 7691 7467 if (rc) 7692 7468 return rc; 7693 7469 ··· 7707 7483 return 0; 7708 7484 } 7709 7485 7710 - static bool gaudi_is_device_idle(struct hl_device *hdev, u64 *mask, 7711 - struct seq_file *s) 7486 + static bool gaudi_is_device_idle(struct hl_device *hdev, u64 *mask_arr, 7487 + u8 mask_len, struct seq_file *s) 7712 7488 { 7713 7489 struct gaudi_device *gaudi = hdev->asic_specific; 7714 7490 const char *fmt = "%-5d%-9s%#-14x%#-12x%#x\n"; 7715 7491 const char *mme_slave_fmt = "%-5d%-9s%-14s%-12s%#x\n"; 7716 7492 const char *nic_fmt = "%-5d%-9s%#-14x%#x\n"; 7493 + unsigned long *mask = (unsigned long *)mask_arr; 7717 7494 u32 qm_glbl_sts0, qm_cgm_sts, dma_core_sts0, tpc_cfg_sts, mme_arch_sts; 7718 7495 bool 
is_idle = true, is_eng_idle, is_slave; 7719 7496 u64 offset; ··· 7740 7515 IS_DMA_IDLE(dma_core_sts0); 7741 7516 is_idle &= is_eng_idle; 7742 7517 7743 - if (mask) 7744 - *mask |= ((u64) !is_eng_idle) << 7745 - (GAUDI_ENGINE_ID_DMA_0 + dma_id); 7518 + if (mask && !is_eng_idle) 7519 + set_bit(GAUDI_ENGINE_ID_DMA_0 + dma_id, mask); 7746 7520 if (s) 7747 7521 seq_printf(s, fmt, dma_id, 7748 7522 is_eng_idle ? "Y" : "N", qm_glbl_sts0, ··· 7762 7538 IS_TPC_IDLE(tpc_cfg_sts); 7763 7539 is_idle &= is_eng_idle; 7764 7540 7765 - if (mask) 7766 - *mask |= ((u64) !is_eng_idle) << 7767 - (GAUDI_ENGINE_ID_TPC_0 + i); 7541 + if (mask && !is_eng_idle) 7542 + set_bit(GAUDI_ENGINE_ID_TPC_0 + i, mask); 7768 7543 if (s) 7769 7544 seq_printf(s, fmt, i, 7770 7545 is_eng_idle ? "Y" : "N", ··· 7790 7567 7791 7568 is_idle &= is_eng_idle; 7792 7569 7793 - if (mask) 7794 - *mask |= ((u64) !is_eng_idle) << 7795 - (GAUDI_ENGINE_ID_MME_0 + i); 7570 + if (mask && !is_eng_idle) 7571 + set_bit(GAUDI_ENGINE_ID_MME_0 + i, mask); 7796 7572 if (s) { 7797 7573 if (!is_slave) 7798 7574 seq_printf(s, fmt, i, ··· 7817 7595 is_eng_idle = IS_QM_IDLE(qm_glbl_sts0, qm_cgm_sts); 7818 7596 is_idle &= is_eng_idle; 7819 7597 7820 - if (mask) 7821 - *mask |= ((u64) !is_eng_idle) << 7822 - (GAUDI_ENGINE_ID_NIC_0 + port); 7598 + if (mask && !is_eng_idle) 7599 + set_bit(GAUDI_ENGINE_ID_NIC_0 + port, mask); 7823 7600 if (s) 7824 7601 seq_printf(s, nic_fmt, port, 7825 7602 is_eng_idle ? "Y" : "N", ··· 7832 7611 is_eng_idle = IS_QM_IDLE(qm_glbl_sts0, qm_cgm_sts); 7833 7612 is_idle &= is_eng_idle; 7834 7613 7835 - if (mask) 7836 - *mask |= ((u64) !is_eng_idle) << 7837 - (GAUDI_ENGINE_ID_NIC_0 + port); 7614 + if (mask && !is_eng_idle) 7615 + set_bit(GAUDI_ENGINE_ID_NIC_0 + port, mask); 7838 7616 if (s) 7839 7617 seq_printf(s, nic_fmt, port, 7840 7618 is_eng_idle ? 
"Y" : "N", ··· 8096 7876 8097 7877 static int gaudi_ctx_init(struct hl_ctx *ctx) 8098 7878 { 7879 + if (ctx->asid == HL_KERNEL_ASID_ID) 7880 + return 0; 7881 + 8099 7882 gaudi_mmu_prepare(ctx->hdev, ctx->asid); 8100 7883 return gaudi_internal_cb_pool_init(ctx->hdev, ctx); 8101 7884 } 8102 7885 8103 7886 static void gaudi_ctx_fini(struct hl_ctx *ctx) 8104 7887 { 8105 - struct hl_device *hdev = ctx->hdev; 8106 - 8107 - /* Gaudi will NEVER support more then a single compute context. 8108 - * Therefore, don't clear anything unless it is the compute context 8109 - */ 8110 - if (hdev->compute_ctx != ctx) 7888 + if (ctx->asid == HL_KERNEL_ASID_ID) 8111 7889 return; 8112 7890 8113 7891 gaudi_internal_cb_pool_fini(ctx->hdev, ctx); ··· 8146 7928 ctl = FIELD_PREP(GAUDI_PKT_SHORT_CTL_ADDR_MASK, sob_id * 4); 8147 7929 ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_OP_MASK, 0); /* write the value */ 8148 7930 ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_BASE_MASK, 3); /* W_S SOB base */ 8149 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_OPCODE_MASK, PACKET_MSG_SHORT); 8150 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_EB_MASK, eb); 8151 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_RB_MASK, 1); 8152 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_MB_MASK, 1); 7931 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_OPCODE_MASK, PACKET_MSG_SHORT); 7932 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_EB_MASK, eb); 7933 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_RB_MASK, 1); 7934 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_MB_MASK, 1); 8153 7935 8154 7936 pkt->value = cpu_to_le32(value); 8155 7937 pkt->ctl = cpu_to_le32(ctl); ··· 8166 7948 8167 7949 ctl = FIELD_PREP(GAUDI_PKT_SHORT_CTL_ADDR_MASK, addr); 8168 7950 ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_BASE_MASK, 2); /* W_S MON base */ 8169 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_OPCODE_MASK, PACKET_MSG_SHORT); 8170 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_EB_MASK, 0); 8171 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_RB_MASK, 1); 8172 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_MB_MASK, 0); /* last pkt MB */ 7951 + ctl |= 
FIELD_PREP(GAUDI_PKT_CTL_OPCODE_MASK, PACKET_MSG_SHORT); 7952 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_EB_MASK, 0); 7953 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_RB_MASK, 1); 7954 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_MB_MASK, 0); /* last pkt MB */ 8173 7955 8174 7956 pkt->value = cpu_to_le32(value); 8175 7957 pkt->ctl = cpu_to_le32(ctl); ··· 8215 7997 ctl = FIELD_PREP(GAUDI_PKT_SHORT_CTL_ADDR_MASK, msg_addr_offset); 8216 7998 ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_OP_MASK, 0); /* write the value */ 8217 7999 ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_BASE_MASK, 2); /* W_S MON base */ 8218 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_OPCODE_MASK, PACKET_MSG_SHORT); 8219 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_EB_MASK, 0); 8220 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_RB_MASK, 1); 8221 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_MB_MASK, 1); 8000 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_OPCODE_MASK, PACKET_MSG_SHORT); 8001 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_EB_MASK, 0); 8002 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_RB_MASK, 1); 8003 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_MB_MASK, 1); 8222 8004 8223 8005 pkt->value = cpu_to_le32(value); 8224 8006 pkt->ctl = cpu_to_le32(ctl); ··· 8236 8018 cfg |= FIELD_PREP(GAUDI_PKT_FENCE_CFG_TARGET_VAL_MASK, 1); 8237 8019 cfg |= FIELD_PREP(GAUDI_PKT_FENCE_CFG_ID_MASK, 2); 8238 8020 8239 - ctl = FIELD_PREP(GAUDI_PKT_FENCE_CTL_OPCODE_MASK, PACKET_FENCE); 8240 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_EB_MASK, 0); 8241 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_RB_MASK, 1); 8242 - ctl |= FIELD_PREP(GAUDI_PKT_SHORT_CTL_MB_MASK, 1); 8021 + ctl = FIELD_PREP(GAUDI_PKT_CTL_OPCODE_MASK, PACKET_FENCE); 8022 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_EB_MASK, 0); 8023 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_RB_MASK, 1); 8024 + ctl |= FIELD_PREP(GAUDI_PKT_CTL_MB_MASK, 1); 8243 8025 8244 8026 pkt->cfg = cpu_to_le32(cfg); 8245 8027 pkt->ctl = cpu_to_le32(ctl); ··· 8435 8217 static void gaudi_reset_sob(struct hl_device *hdev, void *data) 8436 8218 { 8437 8219 struct hl_hw_sob *hw_sob = (struct hl_hw_sob *) data; 8220 + 
int rc; 8438 8221 8439 8222 dev_dbg(hdev->dev, "reset SOB, q_idx: %d, sob_id: %d\n", hw_sob->q_idx, 8440 8223 hw_sob->sob_id); 8441 8224 8442 - WREG32(mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + hw_sob->sob_id * 4, 8443 - 0); 8225 + rc = gaudi_schedule_register_memset(hdev, hw_sob->q_idx, 8226 + CFG_BASE + mmSYNC_MNGR_W_S_SYNC_MNGR_OBJS_SOB_OBJ_0 + 8227 + hw_sob->sob_id * 4, 1, 0); 8228 + if (rc) 8229 + dev_err(hdev->dev, "failed resetting sob %u", hw_sob->sob_id); 8444 8230 8445 8231 kref_init(&hw_sob->kref); 8446 8232 } ··· 8466 8244 u64 device_time = ((u64) RREG32(mmPSOC_TIMESTAMP_CNTCVU)) << 32; 8467 8245 8468 8246 return device_time | RREG32(mmPSOC_TIMESTAMP_CNTCVL); 8247 + } 8248 + 8249 + static int gaudi_get_hw_block_id(struct hl_device *hdev, u64 block_addr, 8250 + u32 *block_id) 8251 + { 8252 + return -EPERM; 8253 + } 8254 + 8255 + static int gaudi_block_mmap(struct hl_device *hdev, 8256 + struct vm_area_struct *vma, 8257 + u32 block_id, u32 block_size) 8258 + { 8259 + return -EPERM; 8469 8260 } 8470 8261 8471 8262 static const struct hl_asic_funcs gaudi_funcs = { ··· 8557 8322 .set_dma_mask_from_fw = gaudi_set_dma_mask_from_fw, 8558 8323 .get_device_time = gaudi_get_device_time, 8559 8324 .collective_wait_init_cs = gaudi_collective_wait_init_cs, 8560 - .collective_wait_create_jobs = gaudi_collective_wait_create_jobs 8325 + .collective_wait_create_jobs = gaudi_collective_wait_create_jobs, 8326 + .scramble_addr = hl_mmu_scramble_addr, 8327 + .descramble_addr = hl_mmu_descramble_addr, 8328 + .ack_protection_bits_errors = gaudi_ack_protection_bits_errors, 8329 + .get_hw_block_id = gaudi_get_hw_block_id, 8330 + .hw_block_mmap = gaudi_block_mmap 8561 8331 }; 8562 8332 8563 8333 /**
drivers/misc/habanalabs/gaudi/gaudiP.h (+3)
··· 251 251 * @hdev: habanalabs device structure. 252 252 * @kref: refcount of this SOB group. group will reset once refcount is zero. 253 253 * @base_sob_id: base sob id of this SOB group. 254 + * @queue_id: id of the queue that waits on this sob group 254 255 */ 255 256 struct gaudi_hw_sob_group { 256 257 struct hl_device *hdev; 257 258 struct kref kref; 258 259 u32 base_sob_id; 260 + u32 queue_id; 259 261 }; 260 262 261 263 #define NUM_SOB_GROUPS (HL_RSVD_SOBS * QMAN_STREAMS) ··· 335 333 }; 336 334 337 335 void gaudi_init_security(struct hl_device *hdev); 336 + void gaudi_ack_protection_bits_errors(struct hl_device *hdev); 338 337 void gaudi_add_device_attr(struct hl_device *hdev, 339 338 struct attribute_group *dev_attr_grp); 340 339 void gaudi_set_pll_profile(struct hl_device *hdev, enum hl_pll_frequency freq);
drivers/misc/habanalabs/gaudi/gaudi_coresight.c (+15 -3)
··· 634 634 WREG32(mmPSOC_ETR_BUFWM, 0x3FFC); 635 635 WREG32(mmPSOC_ETR_RSZ, input->buffer_size); 636 636 WREG32(mmPSOC_ETR_MODE, input->sink_mode); 637 - /* Workaround for H3 #HW-2075 bug: use small data chunks */ 638 - WREG32(mmPSOC_ETR_AXICTL, (is_host ? 0 : 0x700) | 639 - PSOC_ETR_AXICTL_PROTCTRLBIT1_SHIFT); 637 + if (hdev->asic_prop.fw_security_disabled) { 638 + /* make ETR not privileged */ 639 + val = FIELD_PREP( 640 + PSOC_ETR_AXICTL_PROTCTRLBIT0_MASK, 0); 641 + /* make ETR non-secured (inverted logic) */ 642 + val |= FIELD_PREP( 643 + PSOC_ETR_AXICTL_PROTCTRLBIT1_MASK, 1); 644 + /* 645 + * Workaround for H3 #HW-2075 bug: use small data 646 + * chunks 647 + */ 648 + val |= FIELD_PREP(PSOC_ETR_AXICTL_WRBURSTLEN_MASK, 649 + is_host ? 0 : 7); 650 + WREG32(mmPSOC_ETR_AXICTL, val); 651 + } 640 652 WREG32(mmPSOC_ETR_DBALO, 641 653 lower_32_bits(input->buffer_address)); 642 654 WREG32(mmPSOC_ETR_DBAHI,
drivers/misc/habanalabs/gaudi/gaudi_security.c (+5)
··· 13052 13052 13053 13053 gaudi_init_protection_bits(hdev); 13054 13054 } 13055 + 13056 + void gaudi_ack_protection_bits_errors(struct hl_device *hdev) 13057 + { 13058 + 13059 + }
drivers/misc/habanalabs/goya/goya.c (+41 -42)
··· 455 455 456 456 prop->max_pending_cs = GOYA_MAX_PENDING_CS; 457 457 458 + prop->first_available_user_msix_interrupt = USHRT_MAX; 459 + 458 460 /* disable fw security for now, set it in a later stage */ 459 461 prop->fw_security_disabled = true; 460 462 prop->fw_security_status_valid = false; ··· 2916 2914 else 2917 2915 timeout = HL_DEVICE_TIMEOUT_USEC; 2918 2916 2919 - if (!hdev->asic_funcs->is_device_idle(hdev, NULL, NULL)) { 2917 + if (!hdev->asic_funcs->is_device_idle(hdev, NULL, 0, NULL)) { 2920 2918 dev_err_ratelimited(hdev->dev, 2921 2919 "Can't send driver job on QMAN0 because the device is not idle\n"); 2922 2920 return -EBUSY; ··· 3878 3876 patched_cb_handle >>= PAGE_SHIFT; 3879 3877 parser->patched_cb = hl_cb_get(hdev, &hdev->kernel_cb_mgr, 3880 3878 (u32) patched_cb_handle); 3881 - /* hl_cb_get should never fail here so use kernel WARN */ 3882 - WARN(!parser->patched_cb, "DMA CB handle invalid 0x%x\n", 3883 - (u32) patched_cb_handle); 3879 + /* hl_cb_get should never fail here */ 3884 3880 if (!parser->patched_cb) { 3881 + dev_crit(hdev->dev, "DMA CB handle invalid 0x%x\n", 3882 + (u32) patched_cb_handle); 3885 3883 rc = -EFAULT; 3886 3884 goto out; 3887 3885 } ··· 3950 3948 patched_cb_handle >>= PAGE_SHIFT; 3951 3949 parser->patched_cb = hl_cb_get(hdev, &hdev->kernel_cb_mgr, 3952 3950 (u32) patched_cb_handle); 3953 - /* hl_cb_get should never fail here so use kernel WARN */ 3954 - WARN(!parser->patched_cb, "DMA CB handle invalid 0x%x\n", 3955 - (u32) patched_cb_handle); 3951 + /* hl_cb_get should never fail here */ 3956 3952 if (!parser->patched_cb) { 3953 + dev_crit(hdev->dev, "DMA CB handle invalid 0x%x\n", 3954 + (u32) patched_cb_handle); 3957 3955 rc = -EFAULT; 3958 3956 goto out; 3959 3957 } ··· 4124 4122 if (ddr_bar_addr == U64_MAX) 4125 4123 rc = -EIO; 4126 4124 4127 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 4128 - *val = *(u32 *) phys_to_virt(addr - HOST_PHYS_BASE); 4129 - 4130 4125 } else { 4131 4126 rc = 
-EFAULT; 4132 4127 } ··· 4177 4178 if (ddr_bar_addr == U64_MAX) 4178 4179 rc = -EIO; 4179 4180 4180 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 4181 - *(u32 *) phys_to_virt(addr - HOST_PHYS_BASE) = val; 4182 - 4183 4181 } else { 4184 4182 rc = -EFAULT; 4185 4183 } ··· 4219 4223 if (ddr_bar_addr == U64_MAX) 4220 4224 rc = -EIO; 4221 4225 4222 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 4223 - *val = *(u64 *) phys_to_virt(addr - HOST_PHYS_BASE); 4224 - 4225 4226 } else { 4226 4227 rc = -EFAULT; 4227 4228 } ··· 4258 4265 } 4259 4266 if (ddr_bar_addr == U64_MAX) 4260 4267 rc = -EIO; 4261 - 4262 - } else if (addr >= HOST_PHYS_BASE && !iommu_present(&pci_bus_type)) { 4263 - *(u64 *) phys_to_virt(addr - HOST_PHYS_BASE) = val; 4264 4268 4265 4269 } else { 4266 4270 rc = -EFAULT; ··· 4867 4877 4868 4878 WREG32(mmTPC_PLL_CLK_RLX_0, 0x200020); 4869 4879 4870 - goya_mmu_prepare(hdev, asid); 4871 - 4872 4880 goya_clear_sm_regs(hdev); 4873 4881 4874 4882 return 0; ··· 5032 5044 return; 5033 5045 5034 5046 if (asid & ~MME_QM_GLBL_SECURE_PROPS_ASID_MASK) { 5035 - WARN(1, "asid %u is too big\n", asid); 5047 + dev_crit(hdev->dev, "asid %u is too big\n", asid); 5036 5048 return; 5037 5049 } 5038 5050 ··· 5061 5073 else 5062 5074 timeout_usec = MMU_CONFIG_TIMEOUT_USEC; 5063 5075 5064 - mutex_lock(&hdev->mmu_cache_lock); 5065 - 5066 5076 /* L0 & L1 invalidation */ 5067 5077 WREG32(mmSTLB_INV_ALL_START, 1); 5068 5078 ··· 5071 5085 !status, 5072 5086 1000, 5073 5087 timeout_usec); 5074 - 5075 - mutex_unlock(&hdev->mmu_cache_lock); 5076 5088 5077 5089 if (rc) { 5078 5090 dev_err_ratelimited(hdev->dev, ··· 5101 5117 else 5102 5118 timeout_usec = MMU_CONFIG_TIMEOUT_USEC; 5103 5119 5104 - mutex_lock(&hdev->mmu_cache_lock); 5105 - 5106 5120 /* 5107 5121 * TODO: currently invalidate entire L0 & L1 as in regular hard 5108 5122 * invalidation. 
Need to apply invalidation of specific cache lines with ··· 5122 5140 status == pi, 5123 5141 1000, 5124 5142 timeout_usec); 5125 - 5126 - mutex_unlock(&hdev->mmu_cache_lock); 5127 5143 5128 5144 if (rc) { 5129 5145 dev_err_ratelimited(hdev->dev, ··· 5152 5172 if (!(goya->hw_cap_initialized & HW_CAP_CPU_Q)) 5153 5173 return 0; 5154 5174 5155 - rc = hl_fw_cpucp_info_get(hdev, mmCPU_BOOT_DEV_STS0); 5175 + rc = hl_fw_cpucp_info_get(hdev, mmCPU_BOOT_DEV_STS0, mmCPU_BOOT_ERR0); 5156 5176 if (rc) 5157 5177 return rc; 5158 5178 ··· 5187 5207 /* clock gating not supported in Goya */ 5188 5208 } 5189 5209 5190 - static bool goya_is_device_idle(struct hl_device *hdev, u64 *mask, 5191 - struct seq_file *s) 5210 + static bool goya_is_device_idle(struct hl_device *hdev, u64 *mask_arr, 5211 + u8 mask_len, struct seq_file *s) 5192 5212 { 5193 5213 const char *fmt = "%-5d%-9s%#-14x%#-16x%#x\n"; 5194 5214 const char *dma_fmt = "%-5d%-9s%#-14x%#x\n"; 5215 + unsigned long *mask = (unsigned long *)mask_arr; 5195 5216 u32 qm_glbl_sts0, cmdq_glbl_sts0, dma_core_sts0, tpc_cfg_sts, 5196 5217 mme_arch_sts; 5197 5218 bool is_idle = true, is_eng_idle; ··· 5212 5231 IS_DMA_IDLE(dma_core_sts0); 5213 5232 is_idle &= is_eng_idle; 5214 5233 5215 - if (mask) 5216 - *mask |= ((u64) !is_eng_idle) << 5217 - (GOYA_ENGINE_ID_DMA_0 + i); 5234 + if (mask && !is_eng_idle) 5235 + set_bit(GOYA_ENGINE_ID_DMA_0 + i, mask); 5218 5236 if (s) 5219 5237 seq_printf(s, dma_fmt, i, is_eng_idle ? "Y" : "N", 5220 5238 qm_glbl_sts0, dma_core_sts0); ··· 5235 5255 IS_TPC_IDLE(tpc_cfg_sts); 5236 5256 is_idle &= is_eng_idle; 5237 5257 5238 - if (mask) 5239 - *mask |= ((u64) !is_eng_idle) << 5240 - (GOYA_ENGINE_ID_TPC_0 + i); 5258 + if (mask && !is_eng_idle) 5259 + set_bit(GOYA_ENGINE_ID_TPC_0 + i, mask); 5241 5260 if (s) 5242 5261 seq_printf(s, fmt, i, is_eng_idle ? 
"Y" : "N", 5243 5262 qm_glbl_sts0, cmdq_glbl_sts0, tpc_cfg_sts); ··· 5255 5276 IS_MME_IDLE(mme_arch_sts); 5256 5277 is_idle &= is_eng_idle; 5257 5278 5258 - if (mask) 5259 - *mask |= ((u64) !is_eng_idle) << GOYA_ENGINE_ID_MME_0; 5279 + if (mask && !is_eng_idle) 5280 + set_bit(GOYA_ENGINE_ID_MME_0, mask); 5260 5281 if (s) { 5261 5282 seq_printf(s, fmt, 0, is_eng_idle ? "Y" : "N", qm_glbl_sts0, 5262 5283 cmdq_glbl_sts0, mme_arch_sts); ··· 5300 5321 5301 5322 static int goya_ctx_init(struct hl_ctx *ctx) 5302 5323 { 5324 + if (ctx->asid != HL_KERNEL_ASID_ID) 5325 + goya_mmu_prepare(ctx->hdev, ctx->asid); 5326 + 5303 5327 return 0; 5304 5328 } 5305 5329 ··· 5381 5399 5382 5400 } 5383 5401 5402 + static int goya_get_hw_block_id(struct hl_device *hdev, u64 block_addr, 5403 + u32 *block_id) 5404 + { 5405 + return -EPERM; 5406 + } 5407 + 5408 + static int goya_block_mmap(struct hl_device *hdev, struct vm_area_struct *vma, 5409 + u32 block_id, u32 block_size) 5410 + { 5411 + return -EPERM; 5412 + } 5413 + 5384 5414 static const struct hl_asic_funcs goya_funcs = { 5385 5415 .early_init = goya_early_init, 5386 5416 .early_fini = goya_early_fini, ··· 5469 5475 .set_dma_mask_from_fw = goya_set_dma_mask_from_fw, 5470 5476 .get_device_time = goya_get_device_time, 5471 5477 .collective_wait_init_cs = goya_collective_wait_init_cs, 5472 - .collective_wait_create_jobs = goya_collective_wait_create_jobs 5478 + .collective_wait_create_jobs = goya_collective_wait_create_jobs, 5479 + .scramble_addr = hl_mmu_scramble_addr, 5480 + .descramble_addr = hl_mmu_descramble_addr, 5481 + .ack_protection_bits_errors = goya_ack_protection_bits_errors, 5482 + .get_hw_block_id = goya_get_hw_block_id, 5483 + .hw_block_mmap = goya_block_mmap 5473 5484 }; 5474 5485 5475 5486 /*
drivers/misc/habanalabs/goya/goyaP.h (+1)
··· 173 173 void goya_init_tpc_qmans(struct hl_device *hdev); 174 174 int goya_init_cpu_queues(struct hl_device *hdev); 175 175 void goya_init_security(struct hl_device *hdev); 176 + void goya_ack_protection_bits_errors(struct hl_device *hdev); 176 177 int goya_late_init(struct hl_device *hdev); 177 178 void goya_late_fini(struct hl_device *hdev); 178 179
drivers/misc/habanalabs/goya/goya_coresight.c (+9 -2)
··· 434 434 WREG32(mmPSOC_ETR_BUFWM, 0x3FFC); 435 435 WREG32(mmPSOC_ETR_RSZ, input->buffer_size); 436 436 WREG32(mmPSOC_ETR_MODE, input->sink_mode); 437 - WREG32(mmPSOC_ETR_AXICTL, 438 - 0x700 | PSOC_ETR_AXICTL_PROTCTRLBIT1_SHIFT); 437 + if (hdev->asic_prop.fw_security_disabled) { 438 + /* make ETR not privileged */ 439 + val = FIELD_PREP(PSOC_ETR_AXICTL_PROTCTRLBIT0_MASK, 0); 440 + /* make ETR non-secured (inverted logic) */ 441 + val |= FIELD_PREP(PSOC_ETR_AXICTL_PROTCTRLBIT1_MASK, 1); 442 + /* burst size 8 */ 443 + val |= FIELD_PREP(PSOC_ETR_AXICTL_WRBURSTLEN_MASK, 7); 444 + WREG32(mmPSOC_ETR_AXICTL, val); 445 + } 439 446 WREG32(mmPSOC_ETR_DBALO, 440 447 lower_32_bits(input->buffer_address)); 441 448 WREG32(mmPSOC_ETR_DBAHI,
drivers/misc/habanalabs/goya/goya_security.c (+5)
··· 3120 3120 3121 3121 goya_init_protection_bits(hdev); 3122 3122 } 3123 + 3124 + void goya_ack_protection_bits_errors(struct hl_device *hdev) 3125 + { 3126 + 3127 + }
+14
drivers/misc/habanalabs/include/common/cpucp_if.h
···
 	__u8 pad[7];
 };

+enum hl_sm_sei_cause {
+	SM_SEI_SO_OVERFLOW,
+	SM_SEI_LBW_4B_UNALIGNED,
+	SM_SEI_AXI_RESPONSE_ERR
+};
+
+struct hl_eq_sm_sei_data {
+	__le32 sei_log;
+	/* enum hl_sm_sei_cause */
+	__u8 sei_cause;
+	__u8 pad[3];
+};
+
 struct hl_eq_entry {
 	struct hl_eq_header hdr;
 	union {
 		struct hl_eq_ecc_data ecc_data;
 		struct hl_eq_hbm_ecc_data hbm_ecc_data;
+		struct hl_eq_sm_sei_data sm_sei_data;
 		__le64 data[7];
 	};
 };
+14
drivers/misc/habanalabs/include/common/hl_boot_if.h
···
 *					checksum. Trying to program image again
 *					might solve this.
 *
+* CPU_BOOT_ERR0_PLL_FAIL		PLL settings failed, meaning that one
+*					of the PLLs remains in REF_CLK
+*
 * CPU_BOOT_ERR0_ENABLED		Error registers enabled.
 *					This is a main indication that the
 *					running FW populates the error
···
 #define CPU_BOOT_ERR0_EFUSE_FAIL		(1 << 9)
 #define CPU_BOOT_ERR0_PRI_IMG_VER_FAIL		(1 << 10)
 #define CPU_BOOT_ERR0_SEC_IMG_VER_FAIL		(1 << 11)
+#define CPU_BOOT_ERR0_PLL_FAIL			(1 << 12)
 #define CPU_BOOT_ERR0_ENABLED			(1 << 31)

 /*
···
 * CPU_BOOT_DEV_STS0_PLL_INFO_EN	FW retrieval of PLL info is enabled.
 *					Initialized in: linux
 *
+* CPU_BOOT_DEV_STS0_SP_SRAM_EN		SP SRAM is initialized and available
+*					for use.
+*					Initialized in: preboot
+*
 * CPU_BOOT_DEV_STS0_CLK_GATE_EN	Clock Gating enabled.
 *					FW initialized Clock Gating.
 *					Initialized in: preboot
+*
+* CPU_BOOT_DEV_STS0_HBM_ECC_EN		HBM ECC handling Enabled.
+*					FW handles HBM ECC indications.
+*					Initialized in: linux
 *
 * CPU_BOOT_DEV_STS0_ENABLED		Device status register enabled.
 *					This is a main indication that the
···
 #define CPU_BOOT_DEV_STS0_DRAM_SCR_EN		(1 << 9)
 #define CPU_BOOT_DEV_STS0_FW_HARD_RST_EN	(1 << 10)
 #define CPU_BOOT_DEV_STS0_PLL_INFO_EN		(1 << 11)
+#define CPU_BOOT_DEV_STS0_SP_SRAM_EN		(1 << 12)
 #define CPU_BOOT_DEV_STS0_CLK_GATE_EN		(1 << 13)
+#define CPU_BOOT_DEV_STS0_HBM_ECC_EN		(1 << 14)
 #define CPU_BOOT_DEV_STS0_ENABLED		(1 << 31)

 enum cpu_boot_status {
+4
drivers/misc/habanalabs/include/gaudi/gaudi_async_events.h
···
 	GAUDI_EVENT_NIC_SEI_2 = 266,
 	GAUDI_EVENT_NIC_SEI_3 = 267,
 	GAUDI_EVENT_NIC_SEI_4 = 268,
+	GAUDI_EVENT_DMA_IF_SEI_0 = 277,
+	GAUDI_EVENT_DMA_IF_SEI_1 = 278,
+	GAUDI_EVENT_DMA_IF_SEI_2 = 279,
+	GAUDI_EVENT_DMA_IF_SEI_3 = 280,
 	GAUDI_EVENT_PCIE_FLR = 290,
 	GAUDI_EVENT_TPC0_BMON_SPMU = 300,
 	GAUDI_EVENT_TPC0_KRN_ERR = 301,
+4 -1
drivers/misc/habanalabs/include/gaudi/gaudi_masks.h
···
 #define RAZWI_INITIATOR_ID_X_Y_TPC6		RAZWI_INITIATOR_ID_X_Y(7, 6)
 #define RAZWI_INITIATOR_ID_X_Y_TPC7_NIC4_NIC5	RAZWI_INITIATOR_ID_X_Y(8, 6)

-#define PSOC_ETR_AXICTL_PROTCTRLBIT1_SHIFT	1
+#define PSOC_ETR_AXICTL_PROTCTRLBIT1_SHIFT	1
+#define PSOC_ETR_AXICTL_PROTCTRLBIT0_MASK	0x1
+#define PSOC_ETR_AXICTL_PROTCTRLBIT1_MASK	0x2
+#define PSOC_ETR_AXICTL_WRBURSTLEN_MASK	0xF00

 /* STLB_CACHE_INV */
 #define STLB_CACHE_INV_PRODUCER_INDEX_SHIFT	0
+3 -24
drivers/misc/habanalabs/include/gaudi/gaudi_packets.h
···
 	__le64 values[0]; /* data starts here */
 };

+#define GAUDI_PKT_LONG_CTL_OP_SHIFT		20
+#define GAUDI_PKT_LONG_CTL_OP_MASK		0x00300000
+
 struct packet_msg_long {
 	__le32 value;
 	__le32 ctl;
···
 #define GAUDI_PKT_SHORT_CTL_BASE_SHIFT		22
 #define GAUDI_PKT_SHORT_CTL_BASE_MASK		0x00C00000

-#define GAUDI_PKT_SHORT_CTL_OPCODE_SHIFT	24
-#define GAUDI_PKT_SHORT_CTL_OPCODE_MASK	0x1F000000
-
-#define GAUDI_PKT_SHORT_CTL_EB_SHIFT		29
-#define GAUDI_PKT_SHORT_CTL_EB_MASK		0x20000000
-
-#define GAUDI_PKT_SHORT_CTL_RB_SHIFT		30
-#define GAUDI_PKT_SHORT_CTL_RB_MASK		0x40000000
-
-#define GAUDI_PKT_SHORT_CTL_MB_SHIFT		31
-#define GAUDI_PKT_SHORT_CTL_MB_MASK		0x80000000
-
 struct packet_msg_short {
 	__le32 value;
 	__le32 ctl;
···
 #define GAUDI_PKT_FENCE_CTL_PRED_SHIFT		0
 #define GAUDI_PKT_FENCE_CTL_PRED_MASK		0x0000001F
-
-#define GAUDI_PKT_FENCE_CTL_OPCODE_SHIFT	24
-#define GAUDI_PKT_FENCE_CTL_OPCODE_MASK	0x1F000000
-
-#define GAUDI_PKT_FENCE_CTL_EB_SHIFT		29
-#define GAUDI_PKT_FENCE_CTL_EB_MASK		0x20000000
-
-#define GAUDI_PKT_FENCE_CTL_RB_SHIFT		30
-#define GAUDI_PKT_FENCE_CTL_RB_MASK		0x40000000
-
-#define GAUDI_PKT_FENCE_CTL_MB_SHIFT		31
-#define GAUDI_PKT_FENCE_CTL_MB_MASK		0x80000000

 struct packet_fence {
 	__le32 cfg;
+4 -1
drivers/misc/habanalabs/include/goya/asic_reg/goya_masks.h
···
 #define DMA_QM_3_GLBL_CFG1_DMA_STOP_SHIFT	DMA_QM_0_GLBL_CFG1_DMA_STOP_SHIFT
 #define DMA_QM_4_GLBL_CFG1_DMA_STOP_SHIFT	DMA_QM_0_GLBL_CFG1_DMA_STOP_SHIFT

-#define PSOC_ETR_AXICTL_PROTCTRLBIT1_SHIFT	1
+#define PSOC_ETR_AXICTL_PROTCTRLBIT1_SHIFT	1
+#define PSOC_ETR_AXICTL_PROTCTRLBIT0_MASK	0x1
+#define PSOC_ETR_AXICTL_PROTCTRLBIT1_MASK	0x2
+#define PSOC_ETR_AXICTL_WRBURSTLEN_MASK	0xF00

 #endif /* ASIC_REG_GOYA_MASKS_H_ */
+43 -13
include/uapi/misc/habanalabs.h
···
 	__u32 num_of_events;
 	__u32 device_id; /* PCI Device ID */
 	__u32 module_id; /* For mezzanine cards in servers (From OCP spec.) */
-	__u32 reserved[2];
+	__u32 reserved;
+	__u16 first_available_interrupt_id;
+	__u16 reserved2;
 	__u32 cpld_version;
 	__u32 psoc_pci_pll_nr;
 	__u32 psoc_pci_pll_nf;
···
 	__u8 pad[2];
 	__u8 cpucp_version[HL_INFO_VERSION_MAX_LEN];
 	__u8 card_name[HL_INFO_CARD_NAME_MAX_LEN];
+	__u64 reserved3;
+	__u64 dram_page_size;
 };

 struct hl_info_dram_usage {
 	__u64 dram_free_mem;
 	__u64 ctx_dram_mem;
 };

+#define HL_BUSY_ENGINES_MASK_EXT_SIZE	2
+
 struct hl_info_hw_idle {
 	__u32 is_idle;
···
 	/*
 	 * Extended Bitmask of busy engines.
 	 * Bits definition is according to `enum <chip>_engine_id'.
 	 */
-	__u64 busy_engines_mask_ext;
+	__u64 busy_engines_mask_ext[HL_BUSY_ENGINES_MASK_EXT_SIZE];
 };

 struct hl_info_device_status {
···
 };

 /* SIGNAL and WAIT/COLLECTIVE_WAIT flags are mutually exclusive */
 #define HL_CS_FLAGS_FORCE_RESTORE		0x1
 #define HL_CS_FLAGS_SIGNAL			0x2
 #define HL_CS_FLAGS_WAIT			0x4
 #define HL_CS_FLAGS_COLLECTIVE_WAIT		0x8
 #define HL_CS_FLAGS_TIMESTAMP			0x20
+#define HL_CS_FLAGS_STAGED_SUBMISSION		0x40
+#define HL_CS_FLAGS_STAGED_SUBMISSION_FIRST	0x80
+#define HL_CS_FLAGS_STAGED_SUBMISSION_LAST	0x100

 #define HL_CS_STATUS_SUCCESS		0
···
 	/* holds address of array of hl_cs_chunk for execution phase */
 	__u64 chunks_execute;

-	/* this holds address of array of hl_cs_chunk for store phase -
-	 * Currently not in use
-	 */
-	__u64 chunks_store;
+	union {
+		/* this holds address of array of hl_cs_chunk for store phase -
+		 * Currently not in use
+		 */
+		__u64 chunks_store;
+
+		/* Sequence number of a staged submission CS
+		 * valid only if HL_CS_FLAGS_STAGED_SUBMISSION is set
+		 */
+		__u64 seq;
+	};

 	/* Number of chunks in restore phase array. Maximum number is
 	 * HL_MAX_JOBS_PER_CS
···
 #define HL_MEM_OP_MAP		2
 /* Opcode to unmap previously mapped host and device memory */
 #define HL_MEM_OP_UNMAP	3
+/* Opcode to map a hw block */
+#define HL_MEM_OP_MAP_BLOCK	4

 /* Memory flags */
 #define HL_MEM_CONTIGUOUS	0x1
···
 		__u64 mem_size;
 	} map_host;

+	/* HL_MEM_OP_MAP_BLOCK - map a hw block */
+	struct {
+		/*
+		 * HW block address to map, a handle will be returned
+		 * to the user and will be used to mmap the relevant
+		 * block. Only addresses from configuration space are
+		 * allowed.
+		 */
+		__u64 block_addr;
+	} map_block;
+
 	/* HL_MEM_OP_UNMAP - unmap host memory */
 	struct {
 		/* Virtual address returned from HL_MEM_OP_MAP */
···
 	__u64 device_virt_addr;

 	/*
-	 * Used for HL_MEM_OP_ALLOC. This is the assigned
-	 * handle for the allocated memory
+	 * Used for HL_MEM_OP_ALLOC and HL_MEM_OP_MAP_BLOCK.
+	 * This is the assigned handle for the allocated memory
+	 * or mapped block
 	 */
 	__u64 handle;
 };