
mm/page_owner: keep track of page owners

This is the page owner tracking code, which was introduced quite a while
ago. It has been resident in Andrew's tree, but nobody tried to upstream
it, so it stayed there as is. Our company actively uses this feature to
debug memory leaks and to find memory hoggers, so I decided to upstream it.

This functionality helps us find out who allocated each page. When a page
is allocated, we store some information about the allocation in extra
memory. Later, if we need to know the status of all pages, we can retrieve
and analyze it from this stored information.
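In practice, the stored information is read back through the debugfs file
created by this patch and aggregated offline. The workflow below is the one
described in the header comment of the user-space helper added here
(tools/vm/page_owner_sort.c; the binary is built as page_owner_sort by
tools/vm/Makefile):

	cat /sys/kernel/debug/page_owner > page_owner_full.txt
	grep -v ^PFN page_owner_full.txt > page_owner.txt
	./page_owner_sort page_owner.txt sorted_page_owner.txt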

In the previous version of this feature, the extra memory was statically
defined in struct page, but in this version it is allocated outside of
struct page. This lets us turn the feature on or off at boot time without
considerable memory waste.
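Concretely, the mechanism looks roughly like the condensed sketch below.
The names are taken from the hunks in this patch (include/linux/page_ext.h
and include/linux/page_owner.h), not invented, but the sketch omits the
stack-trace bookkeeping that the real code performs:

	/*
	 * Extra per-page data lives here, in a separately allocated array,
	 * rather than inside struct page itself.
	 */
	struct page_ext {
		unsigned long flags;
	#ifdef CONFIG_PAGE_OWNER
		unsigned int order;
		gfp_t gfp_mask;
		struct stack_trace trace;
		unsigned long trace_entries[8];
	#endif
	};

	static inline void set_page_owner(struct page *page,
					unsigned int order, gfp_t gfp_mask)
	{
		/*
		 * page_owner_inited stays false unless "page_owner=on" was
		 * passed on the kernel command line, so this hook is a cheap
		 * no-op when the feature is off.
		 */
		if (likely(!page_owner_inited))
			return;

		__set_page_owner(page, order, gfp_mask);
	}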

Although we already have tracepoints for page allocation/free, using them
to analyze page ownership is rather complex. We would need to enlarge the
trace buffer to prevent entries from being overwritten before the userspace
program is launched, and that program would then have to continually dump
the trace buffer for later analysis. This is more likely to change system
behaviour than simply keeping the information in memory, so it is a poor
fit for debugging.

Moreover, the page_owner feature can be used for various other purposes.
For example, it is used for the fragmentation statistics implemented in
this patch, and I also plan to implement a CMA failure debugging feature
on top of this interface.
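The fragmentation statistic works by comparing, for each allocated page,
the migratetype implied by its recorded gfp_mask with the migratetype of
the pageblock it sits in; a mismatch means the block hosts a fallback
allocation. A condensed fragment of the check added to mm/vmstat.c below:

	/*
	 * Condensed from pagetypeinfo_showmixedcount_print() below: count a
	 * pageblock as "mixed" when it contains a page whose allocation type
	 * differs from the block's own migratetype.
	 */
	pageblock_mt = get_pfnblock_migratetype(page, pfn);
	page_mt = gfpflags_to_migratetype(page_ext->gfp_mask);
	if (pageblock_mt != page_mt) {
		if (is_migrate_cma(pageblock_mt))
			count[MIGRATE_MOVABLE]++;
		else
			count[pageblock_mt]++;
	}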

I'd like to give credit to all the developers who contributed to this
feature, but that isn't easy because I don't know the exact history. Sorry
about that. Below are the people who have a "Signed-off-by" in the patches
in Andrew's tree.

Contributors:
Alexander Nyberg <alexn@dsv.su.se>
Mel Gorman <mgorman@suse.de>
Dave Hansen <dave@linux.vnet.ibm.com>
Minchan Kim <minchan@kernel.org>
Michal Nazarewicz <mina86@mina86.com>
Andrew Morton <akpm@linux-foundation.org>
Jungsoo Son <jungsoo.son@lge.com>

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by: Joonsoo Kim
Committed by: Linus Torvalds
48c96a36 9a92a6ce

11 files changed, 554 insertions(+), 3 deletions(-)
Documentation/kernel-parameters.txt (+6)

 	OSS		[HW,OSS]
 			See Documentation/sound/oss/oss-parameters.txt
 
+	page_owner=	[KNL] Boot-time page_owner enabling option.
+			Storage of the information about who allocated
+			each page is disabled in default. With this switch,
+			we can turn it on.
+			on: enable the feature
+
 	panic=		[KNL] Kernel behaviour on panic: delay <timeout>
 			timeout > 0: seconds before rebooting
 			timeout = 0: wait forever
include/linux/page_ext.h (+10)

 #ifndef __LINUX_PAGE_EXT_H
 #define __LINUX_PAGE_EXT_H
 
+#include <linux/types.h>
+#include <linux/stacktrace.h>
+
 struct pglist_data;
 struct page_ext_operations {
 	bool (*need)(void);
···
 enum page_ext_flags {
 	PAGE_EXT_DEBUG_POISON,		/* Page is poisoned */
 	PAGE_EXT_DEBUG_GUARD,
+	PAGE_EXT_OWNER,
 };
 
 /*
···
  */
 struct page_ext {
 	unsigned long flags;
+#ifdef CONFIG_PAGE_OWNER
+	unsigned int order;
+	gfp_t gfp_mask;
+	struct stack_trace trace;
+	unsigned long trace_entries[8];
+#endif
 };
 
 extern void pgdat_page_ext_init(struct pglist_data *pgdat);
include/linux/page_owner.h (new file, +38)

#ifndef __LINUX_PAGE_OWNER_H
#define __LINUX_PAGE_OWNER_H

#ifdef CONFIG_PAGE_OWNER
extern bool page_owner_inited;
extern struct page_ext_operations page_owner_ops;

extern void __reset_page_owner(struct page *page, unsigned int order);
extern void __set_page_owner(struct page *page,
			unsigned int order, gfp_t gfp_mask);

static inline void reset_page_owner(struct page *page, unsigned int order)
{
	if (likely(!page_owner_inited))
		return;

	__reset_page_owner(page, order);
}

static inline void set_page_owner(struct page *page,
			unsigned int order, gfp_t gfp_mask)
{
	if (likely(!page_owner_inited))
		return;

	__set_page_owner(page, order, gfp_mask);
}
#else
static inline void reset_page_owner(struct page *page, unsigned int order)
{
}
static inline void set_page_owner(struct page *page,
			unsigned int order, gfp_t gfp_mask)
{
}

#endif /* CONFIG_PAGE_OWNER */
#endif /* __LINUX_PAGE_OWNER_H */
lib/Kconfig.debug (+16)

 	  you really need it, and what the merge plan to the mainline kernel for
 	  your module is.
 
+config PAGE_OWNER
+	bool "Track page owner"
+	depends on DEBUG_KERNEL && STACKTRACE_SUPPORT
+	select DEBUG_FS
+	select STACKTRACE
+	select PAGE_EXTENSION
+	help
+	  This keeps track of what call chain is the owner of a page, may
+	  help to find bare alloc_page(s) leaks. Even if you include this
+	  feature on your build, it is disabled in default. You should pass
+	  "page_owner=on" to boot parameter in order to enable it. Eats
+	  a fair amount of memory if enabled. See tools/vm/page_owner_sort.c
+	  for user-space helper.
+
+	  If unsure, say N.
+
 config DEBUG_FS
 	bool "Debug Filesystem"
 	help
mm/Makefile (+1)

 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+obj-$(CONFIG_PAGE_OWNER) += page_owner.o
 obj-$(CONFIG_CLEANCACHE) += cleancache.o
 obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
 obj-$(CONFIG_ZPOOL) += zpool.o
mm/page_alloc.c (+10 -1)

 #include <linux/page_ext.h>
 #include <linux/hugetlb.h>
 #include <linux/sched/rt.h>
+#include <linux/page_owner.h>
 
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
···
 	if (bad)
 		return false;
 
+	reset_page_owner(page, order);
+
 	if (!PageHighMem(page)) {
 		debug_check_no_locks_freed(page_address(page),
 					   PAGE_SIZE << order);
···
 	if (order && (gfp_flags & __GFP_COMP))
 		prep_compound_page(page, order);
 
+	set_page_owner(page, order, gfp_flags);
+
 	return 0;
 }
···
 		split_page(virt_to_page(page[0].shadow), order);
 #endif
 
-	for (i = 1; i < (1 << order); i++)
+	set_page_owner(page, 0, 0);
+	for (i = 1; i < (1 << order); i++) {
 		set_page_refcounted(page + i);
+		set_page_owner(page + i, 0, 0);
+	}
 }
 EXPORT_SYMBOL_GPL(split_page);
···
 		}
 	}
 
+	set_page_owner(page, order, 0);
 	return 1UL << order;
 }
mm/page_ext.c (+4)

 #include <linux/memory.h>
 #include <linux/vmalloc.h>
 #include <linux/kmemleak.h>
+#include <linux/page_owner.h>
 
 /*
  * struct page extension
···
 	&debug_guardpage_ops,
 #ifdef CONFIG_PAGE_POISONING
 	&page_poisoning_ops,
+#endif
+#ifdef CONFIG_PAGE_OWNER
+	&page_owner_ops,
 #endif
 };
 
mm/page_owner.c (new file, +222)

#include <linux/debugfs.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <linux/bootmem.h>
#include <linux/stacktrace.h>
#include <linux/page_owner.h>
#include "internal.h"

static bool page_owner_disabled = true;
bool page_owner_inited __read_mostly;

static int early_page_owner_param(char *buf)
{
	if (!buf)
		return -EINVAL;

	if (strcmp(buf, "on") == 0)
		page_owner_disabled = false;

	return 0;
}
early_param("page_owner", early_page_owner_param);

static bool need_page_owner(void)
{
	if (page_owner_disabled)
		return false;

	return true;
}

static void init_page_owner(void)
{
	if (page_owner_disabled)
		return;

	page_owner_inited = true;
}

struct page_ext_operations page_owner_ops = {
	.need = need_page_owner,
	.init = init_page_owner,
};

void __reset_page_owner(struct page *page, unsigned int order)
{
	int i;
	struct page_ext *page_ext;

	for (i = 0; i < (1 << order); i++) {
		page_ext = lookup_page_ext(page + i);
		__clear_bit(PAGE_EXT_OWNER, &page_ext->flags);
	}
}

void __set_page_owner(struct page *page, unsigned int order, gfp_t gfp_mask)
{
	struct page_ext *page_ext;
	struct stack_trace *trace;

	page_ext = lookup_page_ext(page);

	trace = &page_ext->trace;
	trace->nr_entries = 0;
	trace->max_entries = ARRAY_SIZE(page_ext->trace_entries);
	trace->entries = &page_ext->trace_entries[0];
	trace->skip = 3;
	save_stack_trace(&page_ext->trace);

	page_ext->order = order;
	page_ext->gfp_mask = gfp_mask;

	__set_bit(PAGE_EXT_OWNER, &page_ext->flags);
}

static ssize_t
print_page_owner(char __user *buf, size_t count, unsigned long pfn,
		struct page *page, struct page_ext *page_ext)
{
	int ret;
	int pageblock_mt, page_mt;
	char *kbuf;

	kbuf = kmalloc(count, GFP_KERNEL);
	if (!kbuf)
		return -ENOMEM;

	ret = snprintf(kbuf, count,
			"Page allocated via order %u, mask 0x%x\n",
			page_ext->order, page_ext->gfp_mask);

	if (ret >= count)
		goto err;

	/* Print information relevant to grouping pages by mobility */
	pageblock_mt = get_pfnblock_migratetype(page, pfn);
	page_mt = gfpflags_to_migratetype(page_ext->gfp_mask);
	ret += snprintf(kbuf + ret, count - ret,
			"PFN %lu Block %lu type %d %s Flags %s%s%s%s%s%s%s%s%s%s%s%s\n",
			pfn,
			pfn >> pageblock_order,
			pageblock_mt,
			pageblock_mt != page_mt ? "Fallback" : " ",
			PageLocked(page)	? "K" : " ",
			PageError(page)		? "E" : " ",
			PageReferenced(page)	? "R" : " ",
			PageUptodate(page)	? "U" : " ",
			PageDirty(page)		? "D" : " ",
			PageLRU(page)		? "L" : " ",
			PageActive(page)	? "A" : " ",
			PageSlab(page)		? "S" : " ",
			PageWriteback(page)	? "W" : " ",
			PageCompound(page)	? "C" : " ",
			PageSwapCache(page)	? "B" : " ",
			PageMappedToDisk(page)	? "M" : " ");

	if (ret >= count)
		goto err;

	ret += snprint_stack_trace(kbuf + ret, count - ret,
			&page_ext->trace, 0);
	if (ret >= count)
		goto err;

	ret += snprintf(kbuf + ret, count - ret, "\n");
	if (ret >= count)
		goto err;

	if (copy_to_user(buf, kbuf, ret))
		ret = -EFAULT;

	kfree(kbuf);
	return ret;

err:
	kfree(kbuf);
	return -ENOMEM;
}

static ssize_t
read_page_owner(struct file *file, char __user *buf, size_t count, loff_t *ppos)
{
	unsigned long pfn;
	struct page *page;
	struct page_ext *page_ext;

	if (!page_owner_inited)
		return -EINVAL;

	page = NULL;
	pfn = min_low_pfn + *ppos;

	/* Find a valid PFN or the start of a MAX_ORDER_NR_PAGES area */
	while (!pfn_valid(pfn) && (pfn & (MAX_ORDER_NR_PAGES - 1)) != 0)
		pfn++;

	drain_all_pages(NULL);

	/* Find an allocated page */
	for (; pfn < max_pfn; pfn++) {
		/*
		 * If the new page is in a new MAX_ORDER_NR_PAGES area,
		 * validate the area as existing, skip it if not
		 */
		if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0 && !pfn_valid(pfn)) {
			pfn += MAX_ORDER_NR_PAGES - 1;
			continue;
		}

		/* Check for holes within a MAX_ORDER area */
		if (!pfn_valid_within(pfn))
			continue;

		page = pfn_to_page(pfn);
		if (PageBuddy(page)) {
			unsigned long freepage_order = page_order_unsafe(page);

			if (freepage_order < MAX_ORDER)
				pfn += (1UL << freepage_order) - 1;
			continue;
		}

		page_ext = lookup_page_ext(page);

		/*
		 * Pages allocated before initialization of page_owner are
		 * non-buddy and have no page_owner info.
		 */
		if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags))
			continue;

		/* Record the next PFN to read in the file offset */
		*ppos = (pfn - min_low_pfn) + 1;

		return print_page_owner(buf, count, pfn, page, page_ext);
	}

	return 0;
}

static const struct file_operations proc_page_owner_operations = {
	.read		= read_page_owner,
};

static int __init pageowner_init(void)
{
	struct dentry *dentry;

	if (!page_owner_inited) {
		pr_info("page_owner is disabled\n");
		return 0;
	}

	dentry = debugfs_create_file("page_owner", S_IRUSR, NULL,
			NULL, &proc_page_owner_operations);
	if (IS_ERR(dentry))
		return PTR_ERR(dentry);

	return 0;
}
module_init(pageowner_init)
mm/vmstat.c (+101)

 #include <linux/writeback.h>
 #include <linux/compaction.h>
 #include <linux/mm_inline.h>
+#include <linux/page_ext.h>
+#include <linux/page_owner.h>
 
 #include "internal.h"
···
 	return 0;
 }
 
+#ifdef CONFIG_PAGE_OWNER
+static void pagetypeinfo_showmixedcount_print(struct seq_file *m,
+							pg_data_t *pgdat,
+							struct zone *zone)
+{
+	struct page *page;
+	struct page_ext *page_ext;
+	unsigned long pfn = zone->zone_start_pfn, block_end_pfn;
+	unsigned long end_pfn = pfn + zone->spanned_pages;
+	unsigned long count[MIGRATE_TYPES] = { 0, };
+	int pageblock_mt, page_mt;
+	int i;
+
+	/* Scan block by block. First and last block may be incomplete */
+	pfn = zone->zone_start_pfn;
+
+	/*
+	 * Walk the zone in pageblock_nr_pages steps. If a page block spans
+	 * a zone boundary, it will be double counted between zones. This does
+	 * not matter as the mixed block count will still be correct
+	 */
+	for (; pfn < end_pfn; ) {
+		if (!pfn_valid(pfn)) {
+			pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES);
+			continue;
+		}
+
+		block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
+		block_end_pfn = min(block_end_pfn, end_pfn);
+
+		page = pfn_to_page(pfn);
+		pageblock_mt = get_pfnblock_migratetype(page, pfn);
+
+		for (; pfn < block_end_pfn; pfn++) {
+			if (!pfn_valid_within(pfn))
+				continue;
+
+			page = pfn_to_page(pfn);
+			if (PageBuddy(page)) {
+				pfn += (1UL << page_order(page)) - 1;
+				continue;
+			}
+
+			if (PageReserved(page))
+				continue;
+
+			page_ext = lookup_page_ext(page);
+
+			if (!test_bit(PAGE_EXT_OWNER, &page_ext->flags))
+				continue;
+
+			page_mt = gfpflags_to_migratetype(page_ext->gfp_mask);
+			if (pageblock_mt != page_mt) {
+				if (is_migrate_cma(pageblock_mt))
+					count[MIGRATE_MOVABLE]++;
+				else
+					count[pageblock_mt]++;
+
+				pfn = block_end_pfn;
+				break;
+			}
+			pfn += (1UL << page_ext->order) - 1;
+		}
+	}
+
+	/* Print counts */
+	seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name);
+	for (i = 0; i < MIGRATE_TYPES; i++)
+		seq_printf(m, "%12lu ", count[i]);
+	seq_putc(m, '\n');
+}
+#endif /* CONFIG_PAGE_OWNER */
+
+/*
+ * Print out the number of pageblocks for each migratetype that contain pages
+ * of other types. This gives an indication of how well fallbacks are being
+ * contained by rmqueue_fallback(). It requires information from PAGE_OWNER
+ * to determine what is going on
+ */
+static void pagetypeinfo_showmixedcount(struct seq_file *m, pg_data_t *pgdat)
+{
+#ifdef CONFIG_PAGE_OWNER
+	int mtype;
+
+	if (!page_owner_inited)
+		return;
+
+	drain_all_pages(NULL);
+
+	seq_printf(m, "\n%-23s", "Number of mixed blocks ");
+	for (mtype = 0; mtype < MIGRATE_TYPES; mtype++)
+		seq_printf(m, "%12s ", migratetype_names[mtype]);
+	seq_putc(m, '\n');
+
+	walk_zones_in_node(m, pgdat, pagetypeinfo_showmixedcount_print);
+#endif /* CONFIG_PAGE_OWNER */
+}
+
 /*
  * This prints out statistics in relation to grouping pages by mobility.
  * It is expensive to collect so do not constantly read the file.
···
 	seq_putc(m, '\n');
 	pagetypeinfo_showfree(m, pgdat);
 	pagetypeinfo_showblockcount(m, pgdat);
+	pagetypeinfo_showmixedcount(m, pgdat);
 
 	return 0;
 }
tools/vm/Makefile (+2 -2)

 # Makefile for vm tools
 #
-TARGETS=page-types slabinfo
+TARGETS=page-types slabinfo page_owner_sort
 
 LIB_DIR = ../lib/api
 LIBS = $(LIB_DIR)/libapikfs.a
···
 	$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
 
 clean:
-	$(RM) page-types slabinfo
+	$(RM) page-types slabinfo page_owner_sort
 	make -C $(LIB_DIR) clean
tools/vm/page_owner_sort.c (new file, +144)

/*
 * User-space helper to sort the output of /sys/kernel/debug/page_owner
 *
 * Example use:
 * cat /sys/kernel/debug/page_owner > page_owner_full.txt
 * grep -v ^PFN page_owner_full.txt > page_owner.txt
 * ./sort page_owner.txt sorted_page_owner.txt
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

struct block_list {
	char *txt;
	int len;
	int num;
};


static struct block_list *list;
static int list_size;
static int max_size;

struct block_list *block_head;

int read_block(char *buf, int buf_size, FILE *fin)
{
	char *curr = buf, *const buf_end = buf + buf_size;

	while (buf_end - curr > 1 && fgets(curr, buf_end - curr, fin)) {
		if (*curr == '\n') /* empty line */
			return curr - buf;
		curr += strlen(curr);
	}

	return -1; /* EOF or no space left in buf. */
}

static int compare_txt(const void *p1, const void *p2)
{
	const struct block_list *l1 = p1, *l2 = p2;

	return strcmp(l1->txt, l2->txt);
}

static int compare_num(const void *p1, const void *p2)
{
	const struct block_list *l1 = p1, *l2 = p2;

	return l2->num - l1->num;
}

static void add_list(char *buf, int len)
{
	if (list_size != 0 &&
	    len == list[list_size-1].len &&
	    memcmp(buf, list[list_size-1].txt, len) == 0) {
		list[list_size-1].num++;
		return;
	}
	if (list_size == max_size) {
		printf("max_size too small??\n");
		exit(1);
	}
	list[list_size].txt = malloc(len+1);
	list[list_size].len = len;
	list[list_size].num = 1;
	memcpy(list[list_size].txt, buf, len);
	list[list_size].txt[len] = 0;
	list_size++;
	if (list_size % 1000 == 0) {
		printf("loaded %d\r", list_size);
		fflush(stdout);
	}
}

#define BUF_SIZE	1024

int main(int argc, char **argv)
{
	FILE *fin, *fout;
	char buf[BUF_SIZE];
	int ret, i, count;
	struct block_list *list2;
	struct stat st;

	if (argc < 3) {
		printf("Usage: ./program <input> <output>\n");
		perror("open: ");
		exit(1);
	}

	fin = fopen(argv[1], "r");
	fout = fopen(argv[2], "w");
	if (!fin || !fout) {
		printf("Usage: ./program <input> <output>\n");
		perror("open: ");
		exit(1);
	}

	fstat(fileno(fin), &st);
	max_size = st.st_size / 100; /* hack ... */

	list = malloc(max_size * sizeof(*list));

	for ( ; ; ) {
		ret = read_block(buf, BUF_SIZE, fin);
		if (ret < 0)
			break;

		add_list(buf, ret);
	}

	printf("loaded %d\n", list_size);

	printf("sorting ....\n");

	qsort(list, list_size, sizeof(list[0]), compare_txt);

	list2 = malloc(sizeof(*list) * list_size);

	printf("culling\n");

	for (i = count = 0; i < list_size; i++) {
		if (count == 0 ||
		    strcmp(list2[count-1].txt, list[i].txt) != 0) {
			list2[count++] = list[i];
		} else {
			list2[count-1].num += list[i].num;
		}
	}

	qsort(list2, count, sizeof(list[0]), compare_num);

	for (i = 0; i < count; i++)
		fprintf(fout, "%d times:\n%s\n", list2[i].num, list2[i].txt);

	return 0;
}