Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net/tls: Add generic NIC offload infrastructure

This patch adds a generic infrastructure to offload TLS crypto to a
network device. It enables the kernel TLS socket to skip encryption
and authentication operations on the transmit side of the data path,
leaving those computationally expensive operations to the NIC.

The NIC offload infrastructure builds TLS records and pushes them to
the TCP layer just like the SW KTLS implementation, using the same
API.
TCP segmentation is mostly unaffected. Currently the only exception is
that we prevent mixed SKBs where only part of the payload requires
offload. In the future we are likely to add a similar restriction
following a change cipher spec record.

The notable differences between SW KTLS and NIC offloaded TLS
implementations are as follows:
1. The offloaded implementation builds "plaintext TLS records". These
records contain plaintext instead of ciphertext and placeholder bytes
instead of authentication tags.
2. The offloaded implementation maintains a mapping from TCP sequence
number to TLS records. Thus given a TCP SKB sent from a NIC offloaded
TLS socket, we can use the tls NIC offload infrastructure to obtain
enough context to encrypt the payload of the SKB.
A TLS record is released when the last byte of the record is ack'ed;
this is done through the new icsk_clean_acked callback.
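
The sequence-number-to-record mapping and the release-on-ACK behaviour
described above can be illustrated with a small userspace sketch. It is
not the kernel code: struct record, struct record_list, find_record()
and release_acked() are hypothetical names, and a fixed array stands in
for the kernel's linked list. It only assumes what the text states:
records are kept in transmit order, each one knows the TCP sequence
number where it ends, and comparisons must tolerate 32-bit wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Wraparound-safe "a comes before b" for 32-bit TCP sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical, simplified stand-in for a tracked TLS record. */
struct record {
	uint32_t end_seq;	/* TCP sequence number just past the record */
	uint32_t len;		/* record length in bytes */
};

/* Records in transmit order; recs[0] is the oldest unacked record. */
struct record_list {
	struct record recs[8];
	size_t count;
	uint64_t unacked_record_sn;	/* TLS record number of recs[0] */
};

/* Return the first record that ends after seq, and its TLS record number. */
static const struct record *find_record(const struct record_list *l,
					uint32_t seq, uint64_t *record_sn)
{
	assert(l && record_sn);
	for (size_t i = 0; i < l->count; i++) {
		if (seq_before(seq, l->recs[i].end_seq)) {
			*record_sn = l->unacked_record_sn + i;
			return &l->recs[i];
		}
	}
	return NULL;
}

/* Drop every record whose last byte is covered by acked_seq. */
static size_t release_acked(struct record_list *l, uint32_t acked_seq)
{
	size_t n = 0;

	assert(l);
	while (n < l->count && !seq_before(acked_seq, l->recs[n].end_seq))
		n++;
	for (size_t i = n; i < l->count; i++)
		l->recs[i - n] = l->recs[i];
	l->count -= n;
	l->unacked_record_sn += n;	/* keep record numbering in step */
	return n;
}
```

Advancing unacked_record_sn by the number of released records is what
lets a later lookup still report the correct TLS record number for a
retransmitted segment.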

The infrastructure should be extendable to support various NIC offload
implementations. However, it is currently written with the
implementation below in mind:
The NIC assumes that packets from each offloaded stream are sent as
plaintext and in-order. It keeps track of the TLS records in the TCP
stream. When a packet marked for offload is transmitted, the NIC
encrypts the payload in-place and puts authentication tags in the
relevant placeholders.
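
To make the "plaintext TLS record" layout concrete, here is a minimal
userspace sketch, assuming TLS 1.2 with AES-GCM-128 (5-byte header,
8-byte explicit nonce, 16-byte tag). build_plaintext_record() and the
enum names are hypothetical and do not appear in the patch. The sketch
zeroes the placeholder tag for determinism; the patch itself notes that
the hardware does not care what the placeholder contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	TLS_HDR_LEN   = 5,	/* type (1) + version (2) + length (2) */
	GCM_NONCE_LEN = 8,	/* explicit nonce carried in each record */
	GCM_TAG_LEN   = 16,	/* authentication tag, filled in by the NIC */
};

/*
 * Lay out header | nonce | plaintext | placeholder tag in out.
 * Returns the total record length, or 0 if it does not fit.
 */
static size_t build_plaintext_record(uint8_t *out, size_t out_cap,
				     uint8_t type,
				     const uint8_t nonce[GCM_NONCE_LEN],
				     const uint8_t *payload,
				     size_t payload_len)
{
	size_t body_len = GCM_NONCE_LEN + payload_len + GCM_TAG_LEN;
	size_t total = TLS_HDR_LEN + body_len;

	assert(out && nonce && (payload || payload_len == 0));
	if (total > out_cap || body_len > 0xffff)
		return 0;

	out[0] = type;
	out[1] = 0x03;		/* TLS 1.2 on-the-wire version */
	out[2] = 0x03;
	out[3] = (uint8_t)(body_len >> 8);
	out[4] = (uint8_t)(body_len & 0xff);
	memcpy(out + TLS_HDR_LEN, nonce, GCM_NONCE_LEN);
	if (payload_len)
		memcpy(out + TLS_HDR_LEN + GCM_NONCE_LEN, payload, payload_len);
	/* Placeholder bytes where the device later writes the real tag. */
	memset(out + total - GCM_TAG_LEN, 0, GCM_TAG_LEN);
	return total;
}
```

Because the record already has its final length on the wire, TCP
sequence numbers are unaffected when the device swaps plaintext for
ciphertext and fills in the tag.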

The responsibility for handling out-of-order packets (i.e. TCP
retransmission, qdisc drops) falls on the netdev driver.

The netdev driver keeps track of the expected TCP SN from the NIC's
perspective. If the next packet to transmit matches the expected TCP
SN, the driver advances the expected TCP SN, and transmits the packet
with TLS offload indication.

If the next packet to transmit does not match the expected TCP SN, the
driver calls the TLS layer to obtain the TLS record that includes the
TCP sequence number of the packet for transmission. Using this TLS
record, the driver
posts a work entry on the transmit queue to reconstruct the NIC TLS
state required for the offload of the out-of-order packet. It updates
the expected TCP SN accordingly and transmits the now in-order packet.
The same queue is used for packet transmission and TLS context
reconstruction to avoid the need for flushing the transmit queue before
issuing the context reconstruction request.
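
The driver-side decision described above can be sketched as follows.
This is a userspace illustration under stated assumptions, not code
from any driver: struct drv_tls_state, enum tx_action and
drv_tls_xmit() are hypothetical names, and the actual posting of a
context reconstruction work entry is reduced to a counter.

```c
#include <assert.h>
#include <stdint.h>

enum tx_action {
	TX_OFFLOAD_IN_ORDER,	/* packet matches the expected TCP SN */
	TX_RESYNC_THEN_OFFLOAD,	/* rebuild NIC TLS state first */
};

/* Hypothetical per-connection driver state. */
struct drv_tls_state {
	uint32_t expected_seq;	/* next TCP SN the NIC expects to see */
	unsigned int resyncs;	/* reconstruction requests posted so far */
};

static enum tx_action drv_tls_xmit(struct drv_tls_state *st,
				   uint32_t seq, uint32_t payload_len)
{
	enum tx_action action = TX_OFFLOAD_IN_ORDER;

	assert(st);
	if (seq != st->expected_seq) {
		/*
		 * Out of order (e.g. a retransmission). A real driver would
		 * look up the TLS record covering seq and post a work entry
		 * on the same transmit queue to rebuild the NIC state.
		 */
		st->resyncs++;
		action = TX_RESYNC_THEN_OFFLOAD;
	}

	/* Either way the packet is now in order from the NIC's view. */
	st->expected_seq = seq + payload_len;	/* wraps modulo 2^32 */
	return action;
}
```

Posting the reconstruction request on the same queue as the packet
keeps the two ordered relative to each other, which is why no flush of
the transmit queue is needed.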

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Authored by Ilya Lesokhin and committed by David S. Miller
e8f69799 f66de3ee

+1332 -5
+67 -2
include/net/tls.h
···
 	bool decrypted;
 };
 
+struct tls_record_info {
+	struct list_head list;
+	u32 end_seq;
+	int len;
+	int num_frags;
+	skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+struct tls_offload_context {
+	struct crypto_aead *aead_send;
+	spinlock_t lock;	/* protects records list */
+	struct list_head records_list;
+	struct tls_record_info *open_record;
+	struct tls_record_info *retransmit_hint;
+	u64 hint_record_sn;
+	u64 unacked_record_sn;
+
+	struct scatterlist sg_tx_data[MAX_SKB_FRAGS];
+	void (*sk_destruct)(struct sock *sk);
+	u8 driver_state[];
+	/* The TLS layer reserves room for driver specific state
+	 * Currently the belief is that there is not enough
+	 * driver specific state to justify another layer of indirection
+	 */
+#define TLS_DRIVER_STATE_SIZE (max_t(size_t, 8, sizeof(void *)))
+};
+
+#define TLS_OFFLOAD_CONTEXT_SIZE \
+	(ALIGN(sizeof(struct tls_offload_context), sizeof(void *)) + \
+	 TLS_DRIVER_STATE_SIZE)
+
 enum {
 	TLS_PENDING_CLOSED_RECORD
 };
···
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
 
-void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
-void tls_icsk_clean_acked(struct sock *sk);
+int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendpage(struct sock *sk, struct page *page,
+			int offset, size_t size, int flags);
+void tls_device_sk_destruct(struct sock *sk);
+void tls_device_init(void);
+void tls_device_cleanup(void);
 
+struct tls_record_info *tls_get_record(struct tls_offload_context *context,
+				       u32 seq, u64 *p_record_sn);
+
+static inline bool tls_record_is_start_marker(struct tls_record_info *rec)
+{
+	return rec->len == 0;
+}
+
+static inline u32 tls_record_start_seq(struct tls_record_info *rec)
+{
+	return rec->end_seq - rec->len;
+}
+
+void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
 int tls_push_sg(struct sock *sk, struct tls_context *ctx,
 		struct scatterlist *sg, u16 first_offset,
 		int flags);
···
 static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx)
 {
 	return tls_ctx->pending_open_record_frags;
+}
+
+static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
+{
+	return sk_fullsock(sk) &&
+	       /* matches smp_store_release in tls_set_device_offload */
+	       smp_load_acquire(&sk->sk_destruct) == &tls_device_sk_destruct;
 }
 
 static inline void tls_err_abort(struct sock *sk, int err)
···
 		      unsigned char *record_type);
 void tls_register_device(struct tls_device *device);
 void tls_unregister_device(struct tls_device *device);
+
+struct sk_buff *tls_validate_xmit_skb(struct sock *sk,
+				      struct net_device *dev,
+				      struct sk_buff *skb);
+
+int tls_sw_fallback_init(struct sock *sk,
+			 struct tls_offload_context *offload_ctx,
+			 struct tls_crypto_info *crypto_info);
 
 #endif /* _TLS_OFFLOAD_H */
+10
net/tls/Kconfig
···
 	  encryption handling of the TLS protocol to be done in-kernel.
 
 	  If unsure, say N.
+
+config TLS_DEVICE
+	bool "Transport Layer Security HW offload"
+	depends on TLS
+	select SOCK_VALIDATE_XMIT
+	default n
+	help
+	  Enable kernel support for HW offload of the TLS protocol.
+
+	  If unsure, say N.
+2
net/tls/Makefile
···
 obj-$(CONFIG_TLS) += tls.o
 
 tls-y := tls_main.o tls_sw.o
+
+tls-$(CONFIG_TLS_DEVICE) += tls_device.o tls_device_fallback.o
+764
net/tls/tls_device.c
··· 1 + /* Copyright (c) 2018, Mellanox Technologies All rights reserved. 2 + * 3 + * This software is available to you under a choice of one of two 4 + * licenses. You may choose to be licensed under the terms of the GNU 5 + * General Public License (GPL) Version 2, available from the file 6 + * COPYING in the main directory of this source tree, or the 7 + * OpenIB.org BSD license below: 8 + * 9 + * Redistribution and use in source and binary forms, with or 10 + * without modification, are permitted provided that the following 11 + * conditions are met: 12 + * 13 + * - Redistributions of source code must retain the above 14 + * copyright notice, this list of conditions and the following 15 + * disclaimer. 16 + * 17 + * - Redistributions in binary form must reproduce the above 18 + * copyright notice, this list of conditions and the following 19 + * disclaimer in the documentation and/or other materials 20 + * provided with the distribution. 21 + * 22 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 23 + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 24 + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 25 + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 26 + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 27 + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 28 + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 29 + * SOFTWARE. 30 + */ 31 + 32 + #include <crypto/aead.h> 33 + #include <linux/highmem.h> 34 + #include <linux/module.h> 35 + #include <linux/netdevice.h> 36 + #include <net/dst.h> 37 + #include <net/inet_connection_sock.h> 38 + #include <net/tcp.h> 39 + #include <net/tls.h> 40 + 41 + /* device_offload_lock is used to synchronize tls_dev_add 42 + * against NETDEV_DOWN notifications. 
43 + */ 44 + static DECLARE_RWSEM(device_offload_lock); 45 + 46 + static void tls_device_gc_task(struct work_struct *work); 47 + 48 + static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task); 49 + static LIST_HEAD(tls_device_gc_list); 50 + static LIST_HEAD(tls_device_list); 51 + static DEFINE_SPINLOCK(tls_device_lock); 52 + 53 + static void tls_device_free_ctx(struct tls_context *ctx) 54 + { 55 + struct tls_offload_context *offload_ctx = tls_offload_ctx(ctx); 56 + 57 + kfree(offload_ctx); 58 + kfree(ctx); 59 + } 60 + 61 + static void tls_device_gc_task(struct work_struct *work) 62 + { 63 + struct tls_context *ctx, *tmp; 64 + unsigned long flags; 65 + LIST_HEAD(gc_list); 66 + 67 + spin_lock_irqsave(&tls_device_lock, flags); 68 + list_splice_init(&tls_device_gc_list, &gc_list); 69 + spin_unlock_irqrestore(&tls_device_lock, flags); 70 + 71 + list_for_each_entry_safe(ctx, tmp, &gc_list, list) { 72 + struct net_device *netdev = ctx->netdev; 73 + 74 + if (netdev) { 75 + netdev->tlsdev_ops->tls_dev_del(netdev, ctx, 76 + TLS_OFFLOAD_CTX_DIR_TX); 77 + dev_put(netdev); 78 + } 79 + 80 + list_del(&ctx->list); 81 + tls_device_free_ctx(ctx); 82 + } 83 + } 84 + 85 + static void tls_device_queue_ctx_destruction(struct tls_context *ctx) 86 + { 87 + unsigned long flags; 88 + 89 + spin_lock_irqsave(&tls_device_lock, flags); 90 + list_move_tail(&ctx->list, &tls_device_gc_list); 91 + 92 + /* schedule_work inside the spinlock 93 + * to make sure tls_device_down waits for that work. 
94 + */ 95 + schedule_work(&tls_device_gc_work); 96 + 97 + spin_unlock_irqrestore(&tls_device_lock, flags); 98 + } 99 + 100 + /* We assume that the socket is already connected */ 101 + static struct net_device *get_netdev_for_sock(struct sock *sk) 102 + { 103 + struct dst_entry *dst = sk_dst_get(sk); 104 + struct net_device *netdev = NULL; 105 + 106 + if (likely(dst)) { 107 + netdev = dst->dev; 108 + dev_hold(netdev); 109 + } 110 + 111 + dst_release(dst); 112 + 113 + return netdev; 114 + } 115 + 116 + static void destroy_record(struct tls_record_info *record) 117 + { 118 + int nr_frags = record->num_frags; 119 + skb_frag_t *frag; 120 + 121 + while (nr_frags-- > 0) { 122 + frag = &record->frags[nr_frags]; 123 + __skb_frag_unref(frag); 124 + } 125 + kfree(record); 126 + } 127 + 128 + static void delete_all_records(struct tls_offload_context *offload_ctx) 129 + { 130 + struct tls_record_info *info, *temp; 131 + 132 + list_for_each_entry_safe(info, temp, &offload_ctx->records_list, list) { 133 + list_del(&info->list); 134 + destroy_record(info); 135 + } 136 + 137 + offload_ctx->retransmit_hint = NULL; 138 + } 139 + 140 + static void tls_icsk_clean_acked(struct sock *sk, u32 acked_seq) 141 + { 142 + struct tls_context *tls_ctx = tls_get_ctx(sk); 143 + struct tls_record_info *info, *temp; 144 + struct tls_offload_context *ctx; 145 + u64 deleted_records = 0; 146 + unsigned long flags; 147 + 148 + if (!tls_ctx) 149 + return; 150 + 151 + ctx = tls_offload_ctx(tls_ctx); 152 + 153 + spin_lock_irqsave(&ctx->lock, flags); 154 + info = ctx->retransmit_hint; 155 + if (info && !before(acked_seq, info->end_seq)) { 156 + ctx->retransmit_hint = NULL; 157 + list_del(&info->list); 158 + destroy_record(info); 159 + deleted_records++; 160 + } 161 + 162 + list_for_each_entry_safe(info, temp, &ctx->records_list, list) { 163 + if (before(acked_seq, info->end_seq)) 164 + break; 165 + list_del(&info->list); 166 + 167 + destroy_record(info); 168 + deleted_records++; 169 + } 170 + 171 + 
ctx->unacked_record_sn += deleted_records; 172 + spin_unlock_irqrestore(&ctx->lock, flags); 173 + } 174 + 175 + /* At this point, there should be no references on this 176 + * socket and no in-flight SKBs associated with this 177 + * socket, so it is safe to free all the resources. 178 + */ 179 + void tls_device_sk_destruct(struct sock *sk) 180 + { 181 + struct tls_context *tls_ctx = tls_get_ctx(sk); 182 + struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx); 183 + 184 + if (ctx->open_record) 185 + destroy_record(ctx->open_record); 186 + 187 + delete_all_records(ctx); 188 + crypto_free_aead(ctx->aead_send); 189 + ctx->sk_destruct(sk); 190 + clean_acked_data_disable(inet_csk(sk)); 191 + 192 + if (refcount_dec_and_test(&tls_ctx->refcount)) 193 + tls_device_queue_ctx_destruction(tls_ctx); 194 + } 195 + EXPORT_SYMBOL(tls_device_sk_destruct); 196 + 197 + static void tls_append_frag(struct tls_record_info *record, 198 + struct page_frag *pfrag, 199 + int size) 200 + { 201 + skb_frag_t *frag; 202 + 203 + frag = &record->frags[record->num_frags - 1]; 204 + if (frag->page.p == pfrag->page && 205 + frag->page_offset + frag->size == pfrag->offset) { 206 + frag->size += size; 207 + } else { 208 + ++frag; 209 + frag->page.p = pfrag->page; 210 + frag->page_offset = pfrag->offset; 211 + frag->size = size; 212 + ++record->num_frags; 213 + get_page(pfrag->page); 214 + } 215 + 216 + pfrag->offset += size; 217 + record->len += size; 218 + } 219 + 220 + static int tls_push_record(struct sock *sk, 221 + struct tls_context *ctx, 222 + struct tls_offload_context *offload_ctx, 223 + struct tls_record_info *record, 224 + struct page_frag *pfrag, 225 + int flags, 226 + unsigned char record_type) 227 + { 228 + struct tcp_sock *tp = tcp_sk(sk); 229 + struct page_frag dummy_tag_frag; 230 + skb_frag_t *frag; 231 + int i; 232 + 233 + /* fill prepend */ 234 + frag = &record->frags[0]; 235 + tls_fill_prepend(ctx, 236 + skb_frag_address(frag), 237 + record->len - ctx->tx.prepend_size, 238 + 
record_type); 239 + 240 + /* HW doesn't care about the data in the tag, because it fills it. */ 241 + dummy_tag_frag.page = skb_frag_page(frag); 242 + dummy_tag_frag.offset = 0; 243 + 244 + tls_append_frag(record, &dummy_tag_frag, ctx->tx.tag_size); 245 + record->end_seq = tp->write_seq + record->len; 246 + spin_lock_irq(&offload_ctx->lock); 247 + list_add_tail(&record->list, &offload_ctx->records_list); 248 + spin_unlock_irq(&offload_ctx->lock); 249 + offload_ctx->open_record = NULL; 250 + set_bit(TLS_PENDING_CLOSED_RECORD, &ctx->flags); 251 + tls_advance_record_sn(sk, &ctx->tx); 252 + 253 + for (i = 0; i < record->num_frags; i++) { 254 + frag = &record->frags[i]; 255 + sg_unmark_end(&offload_ctx->sg_tx_data[i]); 256 + sg_set_page(&offload_ctx->sg_tx_data[i], skb_frag_page(frag), 257 + frag->size, frag->page_offset); 258 + sk_mem_charge(sk, frag->size); 259 + get_page(skb_frag_page(frag)); 260 + } 261 + sg_mark_end(&offload_ctx->sg_tx_data[record->num_frags - 1]); 262 + 263 + /* all ready, send */ 264 + return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags); 265 + } 266 + 267 + static int tls_create_new_record(struct tls_offload_context *offload_ctx, 268 + struct page_frag *pfrag, 269 + size_t prepend_size) 270 + { 271 + struct tls_record_info *record; 272 + skb_frag_t *frag; 273 + 274 + record = kmalloc(sizeof(*record), GFP_KERNEL); 275 + if (!record) 276 + return -ENOMEM; 277 + 278 + frag = &record->frags[0]; 279 + __skb_frag_set_page(frag, pfrag->page); 280 + frag->page_offset = pfrag->offset; 281 + skb_frag_size_set(frag, prepend_size); 282 + 283 + get_page(pfrag->page); 284 + pfrag->offset += prepend_size; 285 + 286 + record->num_frags = 1; 287 + record->len = prepend_size; 288 + offload_ctx->open_record = record; 289 + return 0; 290 + } 291 + 292 + static int tls_do_allocation(struct sock *sk, 293 + struct tls_offload_context *offload_ctx, 294 + struct page_frag *pfrag, 295 + size_t prepend_size) 296 + { 297 + int ret; 298 + 299 + if 
(!offload_ctx->open_record) { 300 + if (unlikely(!skb_page_frag_refill(prepend_size, pfrag, 301 + sk->sk_allocation))) { 302 + sk->sk_prot->enter_memory_pressure(sk); 303 + sk_stream_moderate_sndbuf(sk); 304 + return -ENOMEM; 305 + } 306 + 307 + ret = tls_create_new_record(offload_ctx, pfrag, prepend_size); 308 + if (ret) 309 + return ret; 310 + 311 + if (pfrag->size > pfrag->offset) 312 + return 0; 313 + } 314 + 315 + if (!sk_page_frag_refill(sk, pfrag)) 316 + return -ENOMEM; 317 + 318 + return 0; 319 + } 320 + 321 + static int tls_push_data(struct sock *sk, 322 + struct iov_iter *msg_iter, 323 + size_t size, int flags, 324 + unsigned char record_type) 325 + { 326 + struct tls_context *tls_ctx = tls_get_ctx(sk); 327 + struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx); 328 + int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST; 329 + int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE); 330 + struct tls_record_info *record = ctx->open_record; 331 + struct page_frag *pfrag; 332 + size_t orig_size = size; 333 + u32 max_open_record_len; 334 + int copy, rc = 0; 335 + bool done = false; 336 + long timeo; 337 + 338 + if (flags & 339 + ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_SENDPAGE_NOTLAST)) 340 + return -ENOTSUPP; 341 + 342 + if (sk->sk_err) 343 + return -sk->sk_err; 344 + 345 + timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); 346 + rc = tls_complete_pending_work(sk, tls_ctx, flags, &timeo); 347 + if (rc < 0) 348 + return rc; 349 + 350 + pfrag = sk_page_frag(sk); 351 + 352 + /* TLS_HEADER_SIZE is not counted as part of the TLS record, and 353 + * we need to leave room for an authentication tag. 
354 + */ 355 + max_open_record_len = TLS_MAX_PAYLOAD_SIZE + 356 + tls_ctx->tx.prepend_size; 357 + do { 358 + rc = tls_do_allocation(sk, ctx, pfrag, 359 + tls_ctx->tx.prepend_size); 360 + if (rc) { 361 + rc = sk_stream_wait_memory(sk, &timeo); 362 + if (!rc) 363 + continue; 364 + 365 + record = ctx->open_record; 366 + if (!record) 367 + break; 368 + handle_error: 369 + if (record_type != TLS_RECORD_TYPE_DATA) { 370 + /* avoid sending partial 371 + * record with type != 372 + * application_data 373 + */ 374 + size = orig_size; 375 + destroy_record(record); 376 + ctx->open_record = NULL; 377 + } else if (record->len > tls_ctx->tx.prepend_size) { 378 + goto last_record; 379 + } 380 + 381 + break; 382 + } 383 + 384 + record = ctx->open_record; 385 + copy = min_t(size_t, size, (pfrag->size - pfrag->offset)); 386 + copy = min_t(size_t, copy, (max_open_record_len - record->len)); 387 + 388 + if (copy_from_iter_nocache(page_address(pfrag->page) + 389 + pfrag->offset, 390 + copy, msg_iter) != copy) { 391 + rc = -EFAULT; 392 + goto handle_error; 393 + } 394 + tls_append_frag(record, pfrag, copy); 395 + 396 + size -= copy; 397 + if (!size) { 398 + last_record: 399 + tls_push_record_flags = flags; 400 + if (more) { 401 + tls_ctx->pending_open_record_frags = 402 + record->num_frags; 403 + break; 404 + } 405 + 406 + done = true; 407 + } 408 + 409 + if (done || record->len >= max_open_record_len || 410 + (record->num_frags >= MAX_SKB_FRAGS - 1)) { 411 + rc = tls_push_record(sk, 412 + tls_ctx, 413 + ctx, 414 + record, 415 + pfrag, 416 + tls_push_record_flags, 417 + record_type); 418 + if (rc < 0) 419 + break; 420 + } 421 + } while (!done); 422 + 423 + if (orig_size - size > 0) 424 + rc = orig_size - size; 425 + 426 + return rc; 427 + } 428 + 429 + int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) 430 + { 431 + unsigned char record_type = TLS_RECORD_TYPE_DATA; 432 + int rc; 433 + 434 + lock_sock(sk); 435 + 436 + if (unlikely(msg->msg_controllen)) { 437 + rc = 
tls_proccess_cmsg(sk, msg, &record_type); 438 + if (rc) 439 + goto out; 440 + } 441 + 442 + rc = tls_push_data(sk, &msg->msg_iter, size, 443 + msg->msg_flags, record_type); 444 + 445 + out: 446 + release_sock(sk); 447 + return rc; 448 + } 449 + 450 + int tls_device_sendpage(struct sock *sk, struct page *page, 451 + int offset, size_t size, int flags) 452 + { 453 + struct iov_iter msg_iter; 454 + char *kaddr = kmap(page); 455 + struct kvec iov; 456 + int rc; 457 + 458 + if (flags & MSG_SENDPAGE_NOTLAST) 459 + flags |= MSG_MORE; 460 + 461 + lock_sock(sk); 462 + 463 + if (flags & MSG_OOB) { 464 + rc = -ENOTSUPP; 465 + goto out; 466 + } 467 + 468 + iov.iov_base = kaddr + offset; 469 + iov.iov_len = size; 470 + iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size); 471 + rc = tls_push_data(sk, &msg_iter, size, 472 + flags, TLS_RECORD_TYPE_DATA); 473 + kunmap(page); 474 + 475 + out: 476 + release_sock(sk); 477 + return rc; 478 + } 479 + 480 + struct tls_record_info *tls_get_record(struct tls_offload_context *context, 481 + u32 seq, u64 *p_record_sn) 482 + { 483 + u64 record_sn = context->hint_record_sn; 484 + struct tls_record_info *info; 485 + 486 + info = context->retransmit_hint; 487 + if (!info || 488 + before(seq, info->end_seq - info->len)) { 489 + /* if retransmit_hint is irrelevant start 490 + * from the beggining of the list 491 + */ 492 + info = list_first_entry(&context->records_list, 493 + struct tls_record_info, list); 494 + record_sn = context->unacked_record_sn; 495 + } 496 + 497 + list_for_each_entry_from(info, &context->records_list, list) { 498 + if (before(seq, info->end_seq)) { 499 + if (!context->retransmit_hint || 500 + after(info->end_seq, 501 + context->retransmit_hint->end_seq)) { 502 + context->hint_record_sn = record_sn; 503 + context->retransmit_hint = info; 504 + } 505 + *p_record_sn = record_sn; 506 + return info; 507 + } 508 + record_sn++; 509 + } 510 + 511 + return NULL; 512 + } 513 + EXPORT_SYMBOL(tls_get_record); 514 + 515 + static 
int tls_device_push_pending_record(struct sock *sk, int flags) 516 + { 517 + struct iov_iter msg_iter; 518 + 519 + iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, NULL, 0, 0); 520 + return tls_push_data(sk, &msg_iter, 0, flags, TLS_RECORD_TYPE_DATA); 521 + } 522 + 523 + int tls_set_device_offload(struct sock *sk, struct tls_context *ctx) 524 + { 525 + u16 nonce_size, tag_size, iv_size, rec_seq_size; 526 + struct tls_record_info *start_marker_record; 527 + struct tls_offload_context *offload_ctx; 528 + struct tls_crypto_info *crypto_info; 529 + struct net_device *netdev; 530 + char *iv, *rec_seq; 531 + struct sk_buff *skb; 532 + int rc = -EINVAL; 533 + __be64 rcd_sn; 534 + 535 + if (!ctx) 536 + goto out; 537 + 538 + if (ctx->priv_ctx_tx) { 539 + rc = -EEXIST; 540 + goto out; 541 + } 542 + 543 + start_marker_record = kmalloc(sizeof(*start_marker_record), GFP_KERNEL); 544 + if (!start_marker_record) { 545 + rc = -ENOMEM; 546 + goto out; 547 + } 548 + 549 + offload_ctx = kzalloc(TLS_OFFLOAD_CONTEXT_SIZE, GFP_KERNEL); 550 + if (!offload_ctx) { 551 + rc = -ENOMEM; 552 + goto free_marker_record; 553 + } 554 + 555 + crypto_info = &ctx->crypto_send; 556 + switch (crypto_info->cipher_type) { 557 + case TLS_CIPHER_AES_GCM_128: 558 + nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE; 559 + tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE; 560 + iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE; 561 + iv = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->iv; 562 + rec_seq_size = TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE; 563 + rec_seq = 564 + ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->rec_seq; 565 + break; 566 + default: 567 + rc = -EINVAL; 568 + goto free_offload_ctx; 569 + } 570 + 571 + ctx->tx.prepend_size = TLS_HEADER_SIZE + nonce_size; 572 + ctx->tx.tag_size = tag_size; 573 + ctx->tx.overhead_size = ctx->tx.prepend_size + ctx->tx.tag_size; 574 + ctx->tx.iv_size = iv_size; 575 + ctx->tx.iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE, 576 + GFP_KERNEL); 577 + if (!ctx->tx.iv) { 
578 + rc = -ENOMEM; 579 + goto free_offload_ctx; 580 + } 581 + 582 + memcpy(ctx->tx.iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, iv, iv_size); 583 + 584 + ctx->tx.rec_seq_size = rec_seq_size; 585 + ctx->tx.rec_seq = kmalloc(rec_seq_size, GFP_KERNEL); 586 + if (!ctx->tx.rec_seq) { 587 + rc = -ENOMEM; 588 + goto free_iv; 589 + } 590 + memcpy(ctx->tx.rec_seq, rec_seq, rec_seq_size); 591 + 592 + rc = tls_sw_fallback_init(sk, offload_ctx, crypto_info); 593 + if (rc) 594 + goto free_rec_seq; 595 + 596 + /* start at rec_seq - 1 to account for the start marker record */ 597 + memcpy(&rcd_sn, ctx->tx.rec_seq, sizeof(rcd_sn)); 598 + offload_ctx->unacked_record_sn = be64_to_cpu(rcd_sn) - 1; 599 + 600 + start_marker_record->end_seq = tcp_sk(sk)->write_seq; 601 + start_marker_record->len = 0; 602 + start_marker_record->num_frags = 0; 603 + 604 + INIT_LIST_HEAD(&offload_ctx->records_list); 605 + list_add_tail(&start_marker_record->list, &offload_ctx->records_list); 606 + spin_lock_init(&offload_ctx->lock); 607 + 608 + clean_acked_data_enable(inet_csk(sk), &tls_icsk_clean_acked); 609 + ctx->push_pending_record = tls_device_push_pending_record; 610 + offload_ctx->sk_destruct = sk->sk_destruct; 611 + 612 + /* TLS offload is greatly simplified if we don't send 613 + * SKBs where only part of the payload needs to be encrypted. 614 + * So mark the last skb in the write queue as end of record. 615 + */ 616 + skb = tcp_write_queue_tail(sk); 617 + if (skb) 618 + TCP_SKB_CB(skb)->eor = 1; 619 + 620 + refcount_set(&ctx->refcount, 1); 621 + 622 + /* We support starting offload on multiple sockets 623 + * concurrently, so we only need a read lock here. 624 + * This lock must precede get_netdev_for_sock to prevent races between 625 + * NETDEV_DOWN and setsockopt. 
626 + */ 627 + down_read(&device_offload_lock); 628 + netdev = get_netdev_for_sock(sk); 629 + if (!netdev) { 630 + pr_err_ratelimited("%s: netdev not found\n", __func__); 631 + rc = -EINVAL; 632 + goto release_lock; 633 + } 634 + 635 + if (!(netdev->features & NETIF_F_HW_TLS_TX)) { 636 + rc = -ENOTSUPP; 637 + goto release_netdev; 638 + } 639 + 640 + /* Avoid offloading if the device is down 641 + * We don't want to offload new flows after 642 + * the NETDEV_DOWN event 643 + */ 644 + if (!(netdev->flags & IFF_UP)) { 645 + rc = -EINVAL; 646 + goto release_netdev; 647 + } 648 + 649 + ctx->priv_ctx_tx = offload_ctx; 650 + rc = netdev->tlsdev_ops->tls_dev_add(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX, 651 + &ctx->crypto_send, 652 + tcp_sk(sk)->write_seq); 653 + if (rc) 654 + goto release_netdev; 655 + 656 + ctx->netdev = netdev; 657 + 658 + spin_lock_irq(&tls_device_lock); 659 + list_add_tail(&ctx->list, &tls_device_list); 660 + spin_unlock_irq(&tls_device_lock); 661 + 662 + sk->sk_validate_xmit_skb = tls_validate_xmit_skb; 663 + /* following this assignment tls_is_sk_tx_device_offloaded 664 + * will return true and the context might be accessed 665 + * by the netdev's xmit function. 
666 + */ 667 + smp_store_release(&sk->sk_destruct, 668 + &tls_device_sk_destruct); 669 + up_read(&device_offload_lock); 670 + goto out; 671 + 672 + release_netdev: 673 + dev_put(netdev); 674 + release_lock: 675 + up_read(&device_offload_lock); 676 + clean_acked_data_disable(inet_csk(sk)); 677 + crypto_free_aead(offload_ctx->aead_send); 678 + free_rec_seq: 679 + kfree(ctx->tx.rec_seq); 680 + free_iv: 681 + kfree(ctx->tx.iv); 682 + free_offload_ctx: 683 + kfree(offload_ctx); 684 + ctx->priv_ctx_tx = NULL; 685 + free_marker_record: 686 + kfree(start_marker_record); 687 + out: 688 + return rc; 689 + } 690 + 691 + static int tls_device_down(struct net_device *netdev) 692 + { 693 + struct tls_context *ctx, *tmp; 694 + unsigned long flags; 695 + LIST_HEAD(list); 696 + 697 + /* Request a write lock to block new offload attempts */ 698 + down_write(&device_offload_lock); 699 + 700 + spin_lock_irqsave(&tls_device_lock, flags); 701 + list_for_each_entry_safe(ctx, tmp, &tls_device_list, list) { 702 + if (ctx->netdev != netdev || 703 + !refcount_inc_not_zero(&ctx->refcount)) 704 + continue; 705 + 706 + list_move(&ctx->list, &list); 707 + } 708 + spin_unlock_irqrestore(&tls_device_lock, flags); 709 + 710 + list_for_each_entry_safe(ctx, tmp, &list, list) { 711 + netdev->tlsdev_ops->tls_dev_del(netdev, ctx, 712 + TLS_OFFLOAD_CTX_DIR_TX); 713 + ctx->netdev = NULL; 714 + dev_put(netdev); 715 + list_del_init(&ctx->list); 716 + 717 + if (refcount_dec_and_test(&ctx->refcount)) 718 + tls_device_free_ctx(ctx); 719 + } 720 + 721 + up_write(&device_offload_lock); 722 + 723 + flush_work(&tls_device_gc_work); 724 + 725 + return NOTIFY_DONE; 726 + } 727 + 728 + static int tls_dev_event(struct notifier_block *this, unsigned long event, 729 + void *ptr) 730 + { 731 + struct net_device *dev = netdev_notifier_info_to_dev(ptr); 732 + 733 + if (!(dev->features & NETIF_F_HW_TLS_TX)) 734 + return NOTIFY_DONE; 735 + 736 + switch (event) { 737 + case NETDEV_REGISTER: 738 + case NETDEV_FEAT_CHANGE: 739 
+ if (dev->tlsdev_ops && 740 + dev->tlsdev_ops->tls_dev_add && 741 + dev->tlsdev_ops->tls_dev_del) 742 + return NOTIFY_DONE; 743 + else 744 + return NOTIFY_BAD; 745 + case NETDEV_DOWN: 746 + return tls_device_down(dev); 747 + } 748 + return NOTIFY_DONE; 749 + } 750 + 751 + static struct notifier_block tls_dev_notifier = { 752 + .notifier_call = tls_dev_event, 753 + }; 754 + 755 + void __init tls_device_init(void) 756 + { 757 + register_netdevice_notifier(&tls_dev_notifier); 758 + } 759 + 760 + void __exit tls_device_cleanup(void) 761 + { 762 + unregister_netdevice_notifier(&tls_dev_notifier); 763 + flush_work(&tls_device_gc_work); 764 + }
+450
net/tls/tls_device_fallback.c
···
+/* Copyright (c) 2018, Mellanox Technologies All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <net/tls.h>
+#include <crypto/aead.h>
+#include <crypto/scatterwalk.h>
+#include <net/ip6_checksum.h>
+
+static void chain_to_walk(struct scatterlist *sg, struct scatter_walk *walk)
+{
+	struct scatterlist *src = walk->sg;
+	int diff = walk->offset - src->offset;
+
+	sg_set_page(sg, sg_page(src),
+		    src->length - diff, walk->offset);
+
+	scatterwalk_crypto_chain(sg, sg_next(src), 0, 2);
+}
+
+static int tls_enc_record(struct aead_request *aead_req,
+			  struct crypto_aead *aead, char *aad,
+			  char *iv, __be64 rcd_sn,
+			  struct scatter_walk *in,
+			  struct scatter_walk *out, int *in_len)
+{
+	unsigned char buf[TLS_HEADER_SIZE + TLS_CIPHER_AES_GCM_128_IV_SIZE];
+	struct scatterlist sg_in[3];
+	struct scatterlist sg_out[3];
+	u16 len;
+	int rc;
+
+	len = min_t(int, *in_len, ARRAY_SIZE(buf));
+
+	scatterwalk_copychunks(buf, in, len, 0);
+	scatterwalk_copychunks(buf, out, len, 1);
+
+	*in_len -= len;
+	if (!*in_len)
+		return 0;
+
+	scatterwalk_pagedone(in, 0, 1);
+	scatterwalk_pagedone(out, 1, 1);
+
+	len = buf[4] | (buf[3] << 8);
+	len -= TLS_CIPHER_AES_GCM_128_IV_SIZE;
+
+	tls_make_aad(aad, len - TLS_CIPHER_AES_GCM_128_TAG_SIZE,
+		     (char *)&rcd_sn, sizeof(rcd_sn), buf[0]);
+
+	memcpy(iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, buf + TLS_HEADER_SIZE,
+	       TLS_CIPHER_AES_GCM_128_IV_SIZE);
+
+	sg_init_table(sg_in, ARRAY_SIZE(sg_in));
+	sg_init_table(sg_out, ARRAY_SIZE(sg_out));
+	sg_set_buf(sg_in, aad, TLS_AAD_SPACE_SIZE);
+	sg_set_buf(sg_out, aad, TLS_AAD_SPACE_SIZE);
+	chain_to_walk(sg_in + 1, in);
+	chain_to_walk(sg_out + 1, out);
+
+	*in_len -= len;
+	if (*in_len < 0) {
+		*in_len += TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+		/* the input buffer doesn't contain the entire record.
+		 * trim len accordingly. The resulting authentication tag
+		 * will contain garbage, but we don't care, so we won't
+		 * include any of it in the output skb
+		 * Note that we assume the output buffer length
+		 * is larger than input buffer length + tag size
+		 */
+		if (*in_len < 0)
+			len += *in_len;
+
+		*in_len = 0;
+	}
+
+	if (*in_len) {
+		scatterwalk_copychunks(NULL, in, len, 2);
+		scatterwalk_pagedone(in, 0, 1);
+		scatterwalk_copychunks(NULL, out, len, 2);
+		scatterwalk_pagedone(out, 1, 1);
+	}
+
+	len -= TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+	aead_request_set_crypt(aead_req, sg_in, sg_out, len, iv);
+
+	rc = crypto_aead_encrypt(aead_req);
+
+	return rc;
+}
+
+static void tls_init_aead_request(struct aead_request *aead_req,
+				  struct crypto_aead *aead)
+{
+	aead_request_set_tfm(aead_req, aead);
+	aead_request_set_ad(aead_req, TLS_AAD_SPACE_SIZE);
+}
+
+static struct aead_request *tls_alloc_aead_request(struct crypto_aead *aead,
+						   gfp_t flags)
+{
+	unsigned int req_size = sizeof(struct aead_request) +
+		crypto_aead_reqsize(aead);
+	struct aead_request *aead_req;
+
+	aead_req = kzalloc(req_size, flags);
+	if (aead_req)
+		tls_init_aead_request(aead_req, aead);
+	return aead_req;
+}
+
+static int tls_enc_records(struct aead_request *aead_req,
+			   struct crypto_aead *aead, struct scatterlist *sg_in,
+			   struct scatterlist *sg_out, char *aad, char *iv,
+			   u64 rcd_sn, int len)
+{
+	struct scatter_walk out, in;
+	int rc;
+
+	scatterwalk_start(&in, sg_in);
+	scatterwalk_start(&out, sg_out);
+
+	do {
+		rc = tls_enc_record(aead_req, aead, aad, iv,
+				    cpu_to_be64(rcd_sn), &in, &out, &len);
+		rcd_sn++;
+
+	} while (rc == 0 && len);
+
+	scatterwalk_done(&in, 0, 0);
+	scatterwalk_done(&out, 1, 0);
+
+	return rc;
+}
+
+/* Can't use icsk->icsk_af_ops->send_check here because the ip addresses
+ * might have been changed by NAT.
+ */
+static void update_chksum(struct sk_buff *skb, int headln)
+{
+	struct tcphdr *th = tcp_hdr(skb);
+	int datalen = skb->len - headln;
+	const struct ipv6hdr *ipv6h;
+	const struct iphdr *iph;
+
+	/* We only changed the payload so if we are using partial we don't
+	 * need to update anything.
+	 */
+	if (likely(skb->ip_summed == CHECKSUM_PARTIAL))
+		return;
+
+	skb->ip_summed = CHECKSUM_PARTIAL;
+	skb->csum_start = skb_transport_header(skb) - skb->head;
+	skb->csum_offset = offsetof(struct tcphdr, check);
+
+	if (skb->sk->sk_family == AF_INET6) {
+		ipv6h = ipv6_hdr(skb);
+		th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
+					     datalen, IPPROTO_TCP, 0);
+	} else {
+		iph = ip_hdr(skb);
+		th->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, datalen,
+					       IPPROTO_TCP, 0);
+	}
+}
+
+static void complete_skb(struct sk_buff *nskb, struct sk_buff *skb, int headln)
+{
+	skb_copy_header(nskb, skb);
+
+	skb_put(nskb, skb->len);
+	memcpy(nskb->data, skb->data, headln);
+	update_chksum(nskb, headln);
+
+	nskb->destructor = skb->destructor;
+	nskb->sk = skb->sk;
+	skb->destructor = NULL;
+	skb->sk = NULL;
+	refcount_add(nskb->truesize - skb->truesize,
+		     &nskb->sk->sk_wmem_alloc);
+}
+
+/* This function may be called after the user socket is already
+ * closed so make sure we don't use anything freed during
+ * tls_sk_proto_close here
+ */
+
+static int fill_sg_in(struct scatterlist *sg_in,
+		      struct sk_buff *skb,
+		      struct tls_offload_context *ctx,
+		      u64 *rcd_sn,
+		      s32 *sync_size,
+		      int *resync_sgs)
+{
+	int tcp_payload_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
+	int payload_len = skb->len - tcp_payload_offset;
+	u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
+	struct tls_record_info *record;
+	unsigned long flags;
+	int remaining;
+	int i;
+
+	spin_lock_irqsave(&ctx->lock, flags);
+	record = tls_get_record(ctx, tcp_seq, rcd_sn);
+	if (!record) {
+		spin_unlock_irqrestore(&ctx->lock, flags);
+		WARN(1, "Record not found for seq %u\n", tcp_seq);
+		return -EINVAL;
+	}
+
+	*sync_size = tcp_seq - tls_record_start_seq(record);
+	if (*sync_size < 0) {
+		int is_start_marker = tls_record_is_start_marker(record);
+
+		spin_unlock_irqrestore(&ctx->lock, flags);
+		/* This should only occur if the relevant record was
+		 * already acked. In that case it should be ok
+		 * to drop the packet and avoid retransmission.
+		 *
+		 * There is a corner case where the packet contains
+		 * both an acked and a non-acked record.
+		 * We currently don't handle that case and rely
+		 * on TCP to retransmit a packet that doesn't contain
+		 * already acked payload.
+		 */
+		if (!is_start_marker)
+			*sync_size = 0;
+		return -EINVAL;
+	}
+
+	remaining = *sync_size;
+	for (i = 0; remaining > 0; i++) {
+		skb_frag_t *frag = &record->frags[i];
+
+		__skb_frag_ref(frag);
+		sg_set_page(sg_in + i, skb_frag_page(frag),
+			    skb_frag_size(frag), frag->page_offset);
+
+		remaining -= skb_frag_size(frag);
+
+		if (remaining < 0)
+			sg_in[i].length += remaining;
+	}
+	*resync_sgs = i;
+
+	spin_unlock_irqrestore(&ctx->lock, flags);
+	if (skb_to_sgvec(skb, &sg_in[i], tcp_payload_offset, payload_len) < 0)
+		return -EINVAL;
+
+	return 0;
+}
+
+static void fill_sg_out(struct scatterlist sg_out[3], void *buf,
+			struct tls_context *tls_ctx,
+			struct sk_buff *nskb,
+			int tcp_payload_offset,
+			int payload_len,
+			int sync_size,
+			void *dummy_buf)
+{
+	sg_set_buf(&sg_out[0], dummy_buf, sync_size);
+	sg_set_buf(&sg_out[1], nskb->data + tcp_payload_offset, payload_len);
+	/* Add room for authentication tag produced by crypto */
+	dummy_buf += sync_size;
+	sg_set_buf(&sg_out[2], dummy_buf, TLS_CIPHER_AES_GCM_128_TAG_SIZE);
+}
+
+static struct sk_buff *tls_enc_skb(struct tls_context *tls_ctx,
+				   struct scatterlist sg_out[3],
+				   struct scatterlist *sg_in,
+				   struct sk_buff *skb,
+				   s32 sync_size, u64 rcd_sn)
+{
+	int tcp_payload_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
+	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+	int payload_len = skb->len - tcp_payload_offset;
+	void *buf, *iv, *aad, *dummy_buf;
+	struct aead_request *aead_req;
+	struct sk_buff *nskb = NULL;
+	int buf_len;
+
+	aead_req = tls_alloc_aead_request(ctx->aead_send, GFP_ATOMIC);
+	if (!aead_req)
+		return NULL;
+
+	buf_len = TLS_CIPHER_AES_GCM_128_SALT_SIZE +
+		  TLS_CIPHER_AES_GCM_128_IV_SIZE +
+		  TLS_AAD_SPACE_SIZE +
+		  sync_size +
+		  TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+	buf = kmalloc(buf_len, GFP_ATOMIC);
+	if (!buf)
+		goto free_req;
+
+	iv = buf;
+	memcpy(iv, tls_ctx->crypto_send_aes_gcm_128.salt,
+	       TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+	aad = buf + TLS_CIPHER_AES_GCM_128_SALT_SIZE +
+	      TLS_CIPHER_AES_GCM_128_IV_SIZE;
+	dummy_buf = aad + TLS_AAD_SPACE_SIZE;
+
+	nskb = alloc_skb(skb_headroom(skb) + skb->len, GFP_ATOMIC);
+	if (!nskb)
+		goto free_buf;
+
+	skb_reserve(nskb, skb_headroom(skb));
+
+	fill_sg_out(sg_out, buf, tls_ctx, nskb, tcp_payload_offset,
+		    payload_len, sync_size, dummy_buf);
+
+	if (tls_enc_records(aead_req, ctx->aead_send, sg_in, sg_out, aad, iv,
+			    rcd_sn, sync_size + payload_len) < 0)
+		goto free_nskb;
+
+	complete_skb(nskb, skb, tcp_payload_offset);
+
+	/* validate_xmit_skb_list assumes that if the skb wasn't segmented
+	 * nskb->prev will point to the skb itself
+	 */
+	nskb->prev = nskb;
+
+free_buf:
+	kfree(buf);
+free_req:
+	kfree(aead_req);
+	return nskb;
+free_nskb:
+	kfree_skb(nskb);
+	nskb = NULL;
+	goto free_buf;
+}
+
+static struct sk_buff *tls_sw_fallback(struct sock *sk, struct sk_buff *skb)
+{
+	int tcp_payload_offset = skb_transport_offset(skb) + tcp_hdrlen(skb);
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+	int payload_len = skb->len - tcp_payload_offset;
+	struct scatterlist *sg_in, sg_out[3];
+	struct sk_buff *nskb = NULL;
+	int sg_in_max_elements;
+	int resync_sgs = 0;
+	s32 sync_size = 0;
+	u64 rcd_sn;
+
+	/* worst case is:
+	 * MAX_SKB_FRAGS in tls_record_info
+	 * MAX_SKB_FRAGS + 1 in SKB head and frags.
+	 */
+	sg_in_max_elements = 2 * MAX_SKB_FRAGS + 1;
+
+	if (!payload_len)
+		return skb;
+
+	sg_in = kmalloc_array(sg_in_max_elements, sizeof(*sg_in), GFP_ATOMIC);
+	if (!sg_in)
+		goto free_orig;
+
+	sg_init_table(sg_in, sg_in_max_elements);
+	sg_init_table(sg_out, ARRAY_SIZE(sg_out));
+
+	if (fill_sg_in(sg_in, skb, ctx, &rcd_sn, &sync_size, &resync_sgs)) {
+		/* bypass packets before kernel TLS socket option was set */
+		if (sync_size < 0 && payload_len <= -sync_size)
+			nskb = skb_get(skb);
+		goto put_sg;
+	}
+
+	nskb = tls_enc_skb(tls_ctx, sg_out, sg_in, skb, sync_size, rcd_sn);
+
+put_sg:
+	while (resync_sgs)
+		put_page(sg_page(&sg_in[--resync_sgs]));
+	kfree(sg_in);
+free_orig:
+	kfree_skb(skb);
+	return nskb;
+}
+
+struct sk_buff *tls_validate_xmit_skb(struct sock *sk,
+				      struct net_device *dev,
+				      struct sk_buff *skb)
+{
+	if (dev == tls_get_ctx(sk)->netdev)
+		return skb;
+
+	return tls_sw_fallback(sk, skb);
+}
+
+int tls_sw_fallback_init(struct sock *sk,
+			 struct tls_offload_context *offload_ctx,
+			 struct tls_crypto_info *crypto_info)
+{
+	const u8 *key;
+	int rc;
+
+	offload_ctx->aead_send =
+	    crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(offload_ctx->aead_send)) {
+		rc = PTR_ERR(offload_ctx->aead_send);
+		pr_err_ratelimited("crypto_alloc_aead failed rc=%d\n", rc);
+		offload_ctx->aead_send = NULL;
+		goto err_out;
+	}
+
+	key = ((struct tls12_crypto_info_aes_gcm_128 *)crypto_info)->key;
+
+	rc = crypto_aead_setkey(offload_ctx->aead_send, key,
+				TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+	if (rc)
+		goto free_aead;
+
+	rc = crypto_aead_setauthsize(offload_ctx->aead_send,
+				     TLS_CIPHER_AES_GCM_128_TAG_SIZE);
+	if (rc)
+		goto free_aead;
+
+	return 0;
+free_aead:
+	crypto_free_aead(offload_ctx->aead_send);
+err_out:
+	return rc;
+}
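The header arithmetic in tls_enc_record() above can be illustrated outside the kernel. The following is a minimal userspace sketch, not part of the patch: the helper names record_plaintext_len() and build_nonce() are hypothetical, and the sizes are assumed to match TLS_HEADER_SIZE (5), TLS_CIPHER_AES_GCM_128_SALT_SIZE (4), TLS_CIPHER_AES_GCM_128_IV_SIZE (8) and TLS_CIPHER_AES_GCM_128_TAG_SIZE (16) for TLS 1.2 AES-GCM-128.

```c
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the kernel constants used by the patch. */
enum {
	HDR_SIZE  = 5,	/* type(1) + version(2) + length(2) */
	SALT_SIZE = 4,	/* implicit part of the GCM nonce */
	IV_SIZE   = 8,	/* explicit nonce carried after the header */
	TAG_SIZE  = 16,	/* GCM authentication tag */
};

/* Hypothetical helper: plaintext length of one record, derived from the
 * big-endian length field in header bytes 3..4, the same bytes that
 * tls_enc_record() reads as buf[3] and buf[4].
 * Returns -1 if the field is too small to hold the nonce and the tag.
 */
int record_plaintext_len(const uint8_t *hdr)
{
	int len = hdr[4] | (hdr[3] << 8);

	if (len < IV_SIZE + TAG_SIZE)
		return -1;
	return len - IV_SIZE - TAG_SIZE;
}

/* Hypothetical helper: build the 12-byte GCM nonce as salt followed by
 * the explicit IV that sits right after the record header.
 */
void build_nonce(uint8_t nonce[SALT_SIZE + IV_SIZE],
		 const uint8_t salt[SALT_SIZE],
		 const uint8_t *record)
{
	memcpy(nonce, salt, SALT_SIZE);
	memcpy(nonce + SALT_SIZE, record + HDR_SIZE, IV_SIZE);
}
```

For a record whose length field is 0x0028 (40 bytes), the sketch yields 40 - 8 - 16 = 16 bytes of plaintext, which corresponds to the length the fallback path passes to aead_request_set_crypt().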
+39 -3
net/tls/tls_main.c
···
 enum {
 	TLS_BASE,
 	TLS_SW,
+#ifdef CONFIG_TLS_DEVICE
+	TLS_HW,
+#endif
 	TLS_HW_RECORD,
 	TLS_NUM_CONFIG,
 };
···
 		tls_sw_free_resources_rx(sk);
 	}
 
+#ifdef CONFIG_TLS_DEVICE
+	if (ctx->tx_conf != TLS_HW) {
+#else
+	{
+#endif
+		kfree(ctx);
+		ctx = NULL;
+	}
+
 skip_tx_cleanup:
 	release_sock(sk);
 	sk_proto_close(sk, timeout);
···
 	}
 
 	if (tx) {
-		rc = tls_set_sw_offload(sk, ctx, 1);
-		conf = TLS_SW;
+#ifdef CONFIG_TLS_DEVICE
+		rc = tls_set_device_offload(sk, ctx);
+		conf = TLS_HW;
+		if (rc) {
+#else
+		{
+#endif
+			rc = tls_set_sw_offload(sk, ctx, 1);
+			conf = TLS_SW;
+		}
 	} else {
 		rc = tls_set_sw_offload(sk, ctx, 0);
 		conf = TLS_SW;
···
 	prot[TLS_SW][TLS_SW].recvmsg = tls_sw_recvmsg;
 	prot[TLS_SW][TLS_SW].close = tls_sk_proto_close;
 
+#ifdef CONFIG_TLS_DEVICE
+	prot[TLS_HW][TLS_BASE] = prot[TLS_BASE][TLS_BASE];
+	prot[TLS_HW][TLS_BASE].sendmsg = tls_device_sendmsg;
+	prot[TLS_HW][TLS_BASE].sendpage = tls_device_sendpage;
+
+	prot[TLS_HW][TLS_SW] = prot[TLS_BASE][TLS_SW];
+	prot[TLS_HW][TLS_SW].sendmsg = tls_device_sendmsg;
+	prot[TLS_HW][TLS_SW].sendpage = tls_device_sendpage;
+#endif
+
 	prot[TLS_HW_RECORD][TLS_HW_RECORD] = *base;
 	prot[TLS_HW_RECORD][TLS_HW_RECORD].hash = tls_hw_hash;
 	prot[TLS_HW_RECORD][TLS_HW_RECORD].unhash = tls_hw_unhash;
···
 	ctx->getsockopt = sk->sk_prot->getsockopt;
 	ctx->sk_proto_close = sk->sk_prot->close;
 
-	/* Build IPv6 TLS whenever the address of tcpv6_prot changes */
+	/* Build IPv6 TLS whenever the address of tcpv6 _prot changes */
 	if (ip_ver == TLSV6 &&
 	    unlikely(sk->sk_prot != smp_load_acquire(&saved_tcpv6_prot))) {
 		mutex_lock(&tcpv6_prot_mutex);
···
 	tls_sw_proto_ops.poll = tls_sw_poll;
 	tls_sw_proto_ops.splice_read = tls_sw_splice_read;
 
+#ifdef CONFIG_TLS_DEVICE
+	tls_device_init();
+#endif
 	tcp_register_ulp(&tcp_tls_ulp_ops);
 
 	return 0;
···
 static void __exit tls_unregister(void)
 {
 	tcp_unregister_ulp(&tcp_tls_ulp_ops);
+#ifdef CONFIG_TLS_DEVICE
+	tls_device_cleanup();
+#endif
 }
 
 module_init(tls_register);